php – Regex to remove ALL single characters from a string

Posted by: admin July 12, 2020

Questions:

I need a regular expression to remove ALL single characters from a string, not just single letters or numbers.

The string is:

“A Future Ft Casino Karate Chop ( Prod By Metro )”

it should come out as:

“Future Ft Casino Karate Chop Prod By Metro”

The expression I am using at the moment (in PHP) correctly removes the single ‘A’ but leaves the single ‘(’ and ‘)’.

This is the code I am using:

$string = preg_replace('/\b\w\b\s?/', '', $string);

Answers:

Try this:

(^| ).( |$)

Breakdown:

   1.  (^| )  ->  Beginning of line or space  
   2.  .      ->  Any character  
   3.  ( |$)  ->  Space or End of line

Actual code:

$string = preg_replace('/(^| ).( |$)/', '$1', $string); 

Note: I’m not familiar with the workings of PHP regex, so the code might need a slight tweak depending on how the regex needs to be declared.

As m.buettner pointed out, this code leaves a trailing space when the string ends with a single character: the match is replaced by the space captured in front of it. A trim is needed to clear it out.
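
As a minimal sketch of the replacement followed by the trim, here is the pattern applied to the sample string from the question. Nothing here goes beyond preg_replace and trim; the variable name is only illustrative.

```php
<?php
// Sample input from the question.
$string = 'A Future Ft Casino Karate Chop ( Prod By Metro )';

// Remove any single character that sits between spaces or at a string edge,
// keeping the leading separator captured in group 1.
$string = preg_replace('/(^| ).( |$)/', '$1', $string);

// The closing ")" leaves a space behind, so trim the result.
$string = trim($string);

echo $string, PHP_EOL; // Future Ft Casino Karate Chop Prod By Metro
```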

Edit: Arnis Juraga pointed out that this does not clear out several single characters in a row: a b c would be reduced to b. If this is an issue, use this regex:

(^| ).(( ).)*( |$)

The (( ).)* added to the middle matches a space followed by any character, zero or more times. The downside is that, depending on the replacement string you use, this can leave double spaces where a series of single characters was located.

Meaning this:

The a b c dog

Will become this:

The  dog

After removing the single characters, use the following regex to locate any double spaces, then replace each with a single space:

( ){2}
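
The whole sequence (remove runs of single characters, collapse leftover spaces, trim) could be sketched as below. The function name removeSingleChars is hypothetical, and the space-collapsing pattern ' {2,}' is a small variation on ( ){2} that also handles runs longer than two spaces; treat both as assumptions rather than part of the original answer.

```php
<?php
// Hypothetical helper name; it is not part of PHP or of the original answer.
function removeSingleChars(string $text): string
{
    // Remove a single character, or a run of space-separated single
    // characters, keeping only the leading separator captured in group 1.
    $text = preg_replace('/(^| ).(( ).)*( |$)/', '$1', $text);

    // Collapse any run of two or more spaces into one space.
    $text = preg_replace('/ {2,}/', ' ', $text);

    // Drop a leading or trailing space left at the string edges.
    return trim($text);
}

echo removeSingleChars('The a b c dog'), PHP_EOL; // The dog
```

Wrapping the three steps in one function keeps the clean-up order fixed, so the trim always runs after the replacements.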

Answer:

A slightly more efficient version that does not require capturing groups uses lookarounds. It’s a bit less intuitive because of the double negatives:

$string = preg_replace('/(?<!\S).(?!\S)\s*/', '', $input);

This removes any character that is neither preceded nor followed by a non-whitespace character (so only those that sit between whitespace or at the string boundaries). The match also includes all trailing whitespace, so only the preceding whitespace, if any, is left behind. The caveat is that, just like in Nick’s answer, the ) at the end of the string leaves a trailing space (the whitespace in front of the character is not part of the match). This is easily solved by trimming the string.
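
A short sketch of the lookaround version with the trim applied, again using the sample string from the question. The comments spell out each part of the pattern; the variable names follow the snippet above.

```php
<?php
$input = 'A Future Ft Casino Karate Chop ( Prod By Metro )';

// (?<!\S) : no non-whitespace character immediately before
// .       : the single character to remove
// (?!\S)  : no non-whitespace character immediately after
// \s*     : also consume the whitespace that follows
$string = preg_replace('/(?<!\S).(?!\S)\s*/', '', $input);

// Trim the space left in front of the final ")".
$string = trim($string);

echo $string, PHP_EOL; // Future Ft Casino Karate Chop Prod By Metro
```

Because the lookbehind inspects the original subject rather than the already-replaced text, consecutive single characters such as a b c are each matched in turn, so no second pass is needed for them.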