Hyphen Like You Just Don’t Care

I recently went to show a colleague something on my blog and, horror of horrors, saw this:

The nasty, jarring errors.

My first thought was to wonder how long it had been like this; my second, of course, turned to fixing it. Sadly, I didn’t have my credentials with me, so it had to wait until later, but once I could log in, I looked into what had caused this and why.

The Crayon Syntax Highlighter is a plugin I use to highlight bits of code and, to be fair, it has done a great job. So what had suddenly caused it to go wrong? Let’s pick up the breadcrumbs at the file and location mentioned: crayon_langs.class.php, line 340.

Digging in there, we had this line:

Everything looked OK, so after a quick Google I learned that this was a known issue that App Shah and punk-t had already written about (thanks, guys!). Basically, it’s all down to that tricky hyphen: inside a character class it can be read as a range operator, so we need to escape it with a backslash, like this:
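The plugin's actual line is not reproduced here, so what follows is only a hedged sketch. The pattern and the input string are made up for illustration; the point is just the difference between an unescaped and an escaped hyphen inside a character class:

```php
<?php
// Hypothetical pattern, for illustration only (not the plugin's actual line).
// Unescaped: the hyphen between \w and '.' reads like the start of a range.
$unescaped = '/[^\w-.]/';
// Escaped: the backslash makes the hyphen an ordinary literal character.
$escaped = '/[^\w\-.]/';

// Keep only word characters, hyphens and dots; strip everything else.
$clean = preg_replace($escaped, '', 'my file-name.php!');
echo $clean, PHP_EOL; // prints: myfile-name.php
```

Only the escaped pattern is actually run here; the unescaped one is kept as a string for comparison.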

That fixed my problem but I always want to know the underlying “why” so I kept hunting.

If we plug both of these into regex101.com, what do we see?

Without the backward slash

And with the backslash?

With backward slash

Other than giving the hyphen a line of its own, it doesn’t really say much. How about the PHP docs?

The backslash character has several uses. Firstly, if it is followed by a non-alphanumeric character, it takes away any special meaning that character may have. This use of backslash as an escape character applies both inside and outside character classes.

Again, nothing we didn’t know: the backslash is well known for escaping characters. What about the different releases of PHP? Well, we can see exactly when this started by using a fantastic online tool: 3v4l.org. With it, we can take a piece of PHP, run it against a whole range of PHP releases, and capture the output from each. To save you some time, I have done that already here, but if you’d like to try it yourself, just paste this into the window and click the big blue “eval();” button:
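The exact snippet is not reproduced here, so this is a minimal stand-in rather than the original. The pattern and input are assumptions chosen for illustration: a hyphen placed directly after \w and before another character, which is the kind of construct that trips up newer releases.

```php
<?php
// Hypothetical stand-in pattern: an unescaped hyphen right after \w.
$pattern = '/[^\w-.]/';

// On older releases this strips the '!' and returns the cleaned string.
// On newer releases the pattern fails to compile, PHP emits a warning,
// and preg_replace() returns NULL.
$result = preg_replace($pattern, '', 'hello-world.php!');
var_dump($result);
```

Comparing the var_dump output across versions shows where the behaviour changes: a string on the older releases, and a compilation warning followed by NULL on the newer ones.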

As you can see, it fails when it reaches PHP 7.3.0.

Finding the direct cause of this took a little hunting, but I found it here.

Backward Incompatible Changes
Some behavior change can be sighted with invalid patterns…
The userland code is unaffected, whereby the pattern checking is done more precise in PCRE2.

So that’s it. PCRE stands for Perl Compatible Regular Expressions, the library that controls how those pattern-matching strings work. With PCRE2, patterns are checked more precisely, which causes the sloppier way of using the hyphen to fail. By the way, in case you are interested, another solution would have been to place the hyphen just before the closing square bracket, or sometimes directly after a range, like this: “a-c-”.
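As a rough sketch of those alternatives, the three placements below should all treat the hyphen as a literal character. The patterns and the input string are invented for this example and are not taken from the plugin:

```php
<?php
$input = 'a-b.c!';

// Three spellings of the same class: lowercase letters, a dot, a literal hyphen.
$patterns = [
    'escaped'     => '/[^a-z\-.]/', // hyphen escaped with a backslash
    'last'        => '/[^a-z.-]/',  // hyphen just before the closing bracket
    'after range' => '/[^a-z-.]/',  // hyphen directly after the range a-z
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // Strip every character that is not in the class.
    $results[$name] = preg_replace($pattern, '', $input);
    echo $name, ': ', $results[$name], PHP_EOL;
}
```

Each variant should print a-b.c, since only the trailing exclamation mark falls outside the class. Escaping is arguably the safest habit, because it keeps working if someone later reorders the characters inside the brackets.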

Still with me? Be mindful when upgrading PHP: that is my (personal) lesson for today!


Written by Stephen Moon
email: stephen at logicalmoon.com
www: https://www.logicalmoon.com
