PHP - When is a Quote not a Quote

19 April, 2024

Most of us who code in PHP are well aware of the difference between encasing a string in single quotes (apostrophes) and double quotes (quotation marks) in an echo or print statement.

For example:

$x = "not";

echo 'You are $x going tomorrow.';
>> You are $x going tomorrow.

echo "You are $x going tomorrow.";
>> You are not going tomorrow.

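Double quotes tell PHP to parse the string before using it: variables are interpolated and escape sequences are processed. Single quotes leave the contents almost exactly as written.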

Sometimes, however, there are unexpected consequences to using one instead of the other. Take a look at the following regex pattern:

/<(title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span)(.*?)><\/(\1)>/

Given the following text:
 

<title></title><p class='first'></p><span class='blue'></span>

an online regex tester will return 3 matches. Using this pattern in PHP:

$str = "<title></title><p class='first'></p><span class='blue'></span>";

$tags = "title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span";

preg_match_all("/<($tags)(.*?)><\/(\1)>/", $str);

returns 0 instead of 3. Why? Feel free to bang your head against a hard surface for a while… I did, until Jakumi on Stack Overflow saved me from a concussion.

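It's because of the double quotes. They cause PHP to process escape sequences in the string, so the \1 is interpreted as an octal escape and replaced with the single character whose code is 1. The regex engine never sees a backreference; it sees a literal control character, which appears nowhere in the text.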
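A quick way to confirm this is to look at the bytes each kind of literal actually produces. The following is only a minimal sketch; the variable names $double and $single are made up for illustration.

```php
<?php
// Compare what PHP stores for \1 in each kind of string literal.
$double = "\1";   // octal escape: one byte with the value 1
$single = '\1';   // no escape processing: a backslash followed by the digit 1

echo strlen($double), ' ', bin2hex($double), PHP_EOL; // 1 01
echo strlen($single), ' ', bin2hex($single), PHP_EOL; // 2 5c31
```

The double-quoted version is one byte long, which is why the pattern silently stops behaving like a backreference.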

The solution, still using double quotes, would be:

preg_match_all("/<($tags)(.*?)><\/(\\1)>/", $str);

Note the extra backslash to escape the one preceding the 1. The other option is to use single quotes:

preg_match_all('/<(' . $tags . ')(.*?)><\/(\1)>/', $str);
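To see all three variants side by side, something like the script below should do. It reuses $str and $tags from above; the result variables ($broken, $escaped, $single) are names chosen just for this sketch.

```php
<?php
$str  = "<title></title><p class='first'></p><span class='blue'></span>";
$tags = "title|h1|h2|h3|h4|h5|ul|ol|p|figure|caption|span";

// Double quotes, single backslash: \1 becomes the byte 0x01, so nothing matches.
$broken = preg_match_all("/<($tags)(.*?)><\/(\1)>/", $str);

// Double quotes, escaped backslash: the regex engine receives \1 as a backreference.
$escaped = preg_match_all("/<($tags)(.*?)><\/(\\1)>/", $str);

// Single quotes with concatenation: \1 is passed through untouched.
$single = preg_match_all('/<(' . $tags . ')(.*?)><\/(\1)>/', $str);

echo "$broken $escaped $single", PHP_EOL; // 0 3 3
```

Which fix to prefer is mostly a matter of taste: the doubled backslash keeps the variable interpolation, while single quotes keep the pattern looking the same as it does in a regex tester.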

 
