
php – What do the symbols mean in preg_match?

Posted by: admin July 12, 2020

Questions:

I have this expression in a code snippet I borrowed offline. It forces new users to have a password that not only requires upper + lower + numbers, but they must be in that order! If I enter lower + upper + numbers, it fails!

if (preg_match("/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/", $pw_clean, $matches)) {

I’ve searched online but can’t find a resource that tells me what some of the characters mean. I can see that the pattern is preg_match(“/some expression/”, your string, your match).

What do these mean:

1.  ^          -  ???
2.  .*         -  ???
3.  (?=.{4,})  -  requires 4 characters minimum
4.  (?=.*[0-9])-  requires it to have numbers
5.  (?=.*[a-z])-  requires it to have lowercase
6.  (?=.*[A-Z])-  requires it to have uppercase
7.  .*$        -  ???
Answers:

Here are the direct answers. I kept them short because they won’t make sense without an understanding of regex. That understanding is best gained at regular-expressions.info. I advise you to also try out the regex helper tools listed there. They let you experiment and see live capturing/matching as you edit the pattern, which is very helpful.


1: The caret ^ is an anchor. It means “the start of the haystack/string/line”.

  • If a caret is the first symbol inside a character class [], it has a different meaning: it negates the class. (So in [^ab] the caret makes that class match any single character that is neither a nor b.)
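
To make the two meanings of the caret concrete, here is a minimal sketch. The subject strings and variable names are made up for illustration.

```php
<?php
// Outside a character class, ^ anchors the match to the start of the string.
$startsWithA = preg_match('/^a/', 'abc');   // "a" is at the start
$aNotAtStart = preg_match('/^a/', 'bac');   // "a" exists, but not at the start

// As the first symbol inside [], ^ negates the class:
// [^ab] matches one character that is neither "a" nor "b".
preg_match('/[^ab]/', 'abba-c', $m);
$firstOther = $m[0];                        // the "-" after "abba"

var_dump($startsWithA, $aNotAtStart, $firstOther);
```

The first call should return 1 and the second 0, while the negated class skips over "abba" and picks up the hyphen.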

2: The dot . and the asterisk * serve two separate purposes:

  • The dot matches any single character except newline \n.
  • The asterisk says “allow zero or more of the preceding element”.

When these two are combined as .* it basically reads “zero or more of any character, until a newline or another rule comes into effect”.
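
A small sketch of how .* behaves around a newline. The sample text is invented, and the s modifier (which lets the dot match newlines too) is not used in the pattern from the question; it is shown only for contrast.

```php
<?php
$text = "first line\nsecond line";

// Without modifiers, the dot does not match "\n", so .* stops at the line end.
preg_match('/.*/', $text, $m);
$defaultMatch = $m[0];                  // "first line"

// With the s modifier, the dot also matches newlines.
preg_match('/.*/s', $text, $m);
$dotAllMatch = $m[0];                   // the whole string

// * allows zero occurrences, so .* is satisfied by an empty string as well.
$emptyOk = preg_match('/^.*$/', '');

var_dump($defaultMatch, $dotAllMatch, $emptyOk);
```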

7: The dollar $ is also an anchor like the caret, with the opposite function: “the end of the haystack”.
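
A short sketch of the dollar anchor, alone and combined with the caret. The sample strings are made up; the D modifier in the last lines is not part of the pattern from the question and is shown only as a side note.

```php
<?php
$endsWithC = preg_match('/c$/', 'abc');    // "c" is the last character
$cNotAtEnd = preg_match('/c$/', 'abcd');   // "c" is followed by "d"

// Both anchors together force the pattern to cover the whole string.
$onlyDigits = preg_match('/^[0-9]+$/', '2020');
$mixed      = preg_match('/^[0-9]+$/', '2020a');

// By default $ also matches just before one trailing newline.
// The D modifier restricts it to the very end of the string.
$beforeNewline = preg_match('/c$/', "abc\n");
$strictEnd     = preg_match('/c$/D', "abc\n");

var_dump($endsWithC, $cNotAtEnd, $onlyDigits, $mixed, $beforeNewline, $strictEnd);
```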


Edit:

Simple parentheses ( ) around something make it a group. Here you have (?=) which is an assertion, specifically a positive lookahead assertion. All it does is check whether what’s inside actually exists forward from the current cursor position in the haystack, without moving the cursor. Still with me?
Example: foo(?=bar) matches foo only if followed by bar. bar is never matched, only foo is returned.
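
The foo(?=bar) example can be tried directly. This is only a sketch; the second pattern with \w+ is an addition of mine to show that the lookahead consumes nothing.

```php
<?php
// foo(?=bar): match "foo" only when "bar" follows; "bar" is checked, not consumed.
preg_match('/foo(?=bar)/', 'foobar', $m);
$matched = $m[0];                                  // only "foo" is returned

$noBar = preg_match('/foo(?=bar)/', 'foobaz');     // no "bar" ahead, no match

// Because the lookahead consumes nothing, the next token starts right after "foo".
preg_match('/foo(?=bar)\w+/', 'foobar', $m2);
$withRest = $m2[0];                                // \w+ picks up "bar" itself

var_dump($matched, $noBar, $withRest);
```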

With this in mind, let’s dissect your regex:

/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/

Reads as:
        ^.* From the start, match 0-many of any character
  (?=.{4,}) if at least 4 of anything follow this position
(?=.*[0-9]) if there is: 0-many of any, ending with a digit following
(?=.*[a-z]) if there is: 0-many of any, ending with a lowercase letter following
(?=.*[A-Z]) if there is: 0-many of any, ending with an uppercase letter following
        .*$ 0-many of anything preceding the end
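
Since every lookahead is checked from the same cursor position, the order in which they are written should not matter. One way to see this is to test each rule on its own. The helper below, failedRules, is hypothetical and not part of the original snippet; it anchors each lookahead at the start with ^, which for single-line passwords should give the same verdict as the combined pattern.

```php
<?php
// Hypothetical helper for illustration: returns the labels of the rules a password breaks.
function failedRules(string $pw): array
{
    // Each rule is one lookahead from the original pattern, anchored at the start.
    $rules = [
        'at least 4 characters' => '/^(?=.{4,})/',
        'a digit'               => '/^(?=.*[0-9])/',
        'a lowercase letter'    => '/^(?=.*[a-z])/',
        'an uppercase letter'   => '/^(?=.*[A-Z])/',
    ];

    $failed = [];
    foreach ($rules as $label => $pattern) {
        if (preg_match($pattern, $pw) !== 1) {
            $failed[] = $label;
        }
    }
    return $failed;
}

print_r(failedRules('1Baa'));   // digit first, then upper, then lower
print_r(failedRules('aa11'));   // no uppercase letter
```

An empty array means all four rules passed, so a helper like this could also be used to tell the user which requirement is missing.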

You say the order of password characters matters, but it doesn’t in my tests. See the test script below. Hope this cleared up a thing or two. If you are looking for another regex which is a bit more forgiving, see regex password validation.

<pre>
<?php
// Only the last 3 fail, as they should. You claim the first does not work?
$subjects = array("aaB1", "Baa1", "1Baa", "1aaB", "aa1B", "aa11", "aaBB", "aB1");

foreach($subjects as $s)
{
    $res = preg_match("/^.*(?=.{4,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/", $s, $matches);
    echo "result: ";
    print_r($res);

    echo "<br>";
    print_r($matches);
    echo "<hr>";
}

Excellent online tool for checking and testing Regular Expressions:
https://regex101.com/

Answer:

If you don’t know this site, you should go there immediately.

This is like the bible of regular expressions.

Regular-expressions.info

Answer:

To use regular expressions, you first need to learn the syntax. This syntax consists of a series of letters, numbers, dots, hyphens and special signs, which we can group together using different parentheses.

Look at this link: Getting Started with PHP Regular Expressions. It is an easy way to learn regular expressions.