Regex cheat sheet
regex reserved characters = must be escaped to be taken literally
^.[$()|*+?{\
escape character
\
how to represent a backslash?
A literal \ is created using \\
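NOTE: In PHP, the pattern is written inside a string literal, so a backslash is unescaped twice: once by PHP and once by the regex engine. You therefore need three or four backslashes in the source to represent a single literal backslash.
To match Tests\ComProtocolTest, you need the following regex: /Tests\\\\ComProtocolTest/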
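A minimal PHP sketch of the escaping rule above, assuming a single-quoted pattern string. The subject string is the one from the note; preg_quote() is shown as one possible way to avoid counting backslashes by hand.

```php
<?php
// Single quotes keep the backslash before "C" as-is.
$subject = 'Tests\ComProtocolTest';

// Four backslashes in the PHP source become two in the pattern,
// which the regex engine reads as one literal backslash.
$matched = preg_match('/Tests\\\\ComProtocolTest/', $subject);

// preg_quote() escapes regex metacharacters (here the backslash) for us.
$quoted = preg_quote($subject, '/');
$matchedQuoted = preg_match('/' . $quoted . '/', $subject);

var_dump($matched, $matchedQuoted); // int(1), int(1)
```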
quantifiers
? zero or one
* zero or more
+ one or more
{2} exactly two
{2,} two or more
{2,5} two to five
examples
abc? ab followed by zero or one c
abc+ ab followed by one or more c
(phone){2} exactly two copies of phone (phonephone)
a(bc){2,5} a followed by 2 up to 5 copies of the sequence bc
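The quantifier examples can be checked with a short PHP sketch. The test strings are made up for illustration, and ^ and $ are added so that the whole string has to match, which makes the bounds visible.

```php
<?php
// Each entry: [pattern, subject]. Subjects are illustrative values only.
$checks = [
    ['/^abc?$/', 'ab'],               // zero c: allowed by ?
    ['/^abc?$/', 'abcc'],             // two c: too many for ?
    ['/^abc+$/', 'abccc'],            // one or more c
    ['/^(phone){2}$/', 'phonephone'], // exactly two copies
    ['/^a(bc){2,5}$/', 'abc'],        // one copy of bc: too few
    ['/^a(bc){2,5}$/', 'abcbcbc'],    // three copies: within 2..5
];

$results = [];
foreach ($checks as [$pattern, $subject]) {
    $results[] = preg_match($pattern, $subject) === 1;
}

var_dump($results); // true, false, true, true, false, true
```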
anchors
^ beginning of string
$ end of string
examples
roar matches any string that has the text roar in it
^The matches any string that starts with The
end$ matches a string that ends with end
^The end$ exact string match (starts and ends with The end)
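A small PHP sketch of the four anchor examples, run against one made-up sentence:

```php
<?php
// Illustrative subject string.
$subject = 'The lion will roar at the end';

$contains = preg_match('/roar/', $subject);     // 1: roar appears somewhere
$starts   = preg_match('/^The/', $subject);     // 1: string starts with The
$ends     = preg_match('/end$/', $subject);     // 1: string ends with end
$exact    = preg_match('/^The end$/', $subject); // 0: string is not exactly "The end"

var_dump($contains, $starts, $ends, $exact);
```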
OR operator
a[bcd] a followed by either b, c or d
a(?:one|two) a followed by one or two without capture
a(one|two) a followed by one or two with capture
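The difference between the capturing and non-capturing forms shows up in the match array. A brief PHP sketch, with variable names chosen only for this example:

```php
<?php
// Capturing group: the chosen alternative is stored as group 1.
preg_match('/a(one|two)/', 'atwo', $withCapture);

// Non-capturing group: same match, but no group is stored.
preg_match('/a(?:one|two)/', 'atwo', $withoutCapture);

var_dump($withCapture);    // ['atwo', 'two']
var_dump($withoutCapture); // ['atwo']
```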
range
[a-z] any lowercase ASCII letter
characters
. match any character except for line terminators
\S match non-whitespace character
\s match whitespace character (includes tabs and newline). \s is short for [ \t\r\n\f].
\d match digit character. short for [0-9].
\w match word character. short for [a-zA-Z0-9_].
\D match non-digit character
\W match non-word character
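\A match beginning of the whole string (as opposed to ^, which also matches at the beginning of each line in multi-line mode)
\Z match end of the whole string, or just before a final newline (as opposed to $, which also matches at the end of each line in multi-line mode); \z matches only at the very end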
\p{L} any letter character from any language
\p{M} marks (a character that is to be combined with another: accent, etc.)
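A rough PHP sketch of some of the shorthand classes, plus the difference between ^ and \A in multi-line mode. It also shows that in PCRE \Z still matches before one trailing newline, while \z does not. All sample strings are made up.

```php
<?php
preg_match_all('/\d+/', 'order 66, aisle 7', $digits); // runs of digits
preg_match_all('/\w+/', 'snake_case, x-1', $words);    // runs of word characters
$parts = preg_split('/\s+/', "a \t b\nc");             // split on any whitespace

$text = "line one\nline two";
$caretHits = preg_match_all('/^line/m', $text);  // 2: ^ matches at every line start
$bigAHits  = preg_match_all('/\Aline/m', $text); // 1: \A matches only at the string start

$upperZ = preg_match('/end\Z/', "end\n"); // 1: \Z tolerates one trailing newline
$lowerZ = preg_match('/end\z/', "end\n"); // 0: \z requires the very end

var_dump($digits[0], $words[0], $parts);
// ['66', '7'], ['snake_case', 'x', '1'], ['a', 'b', 'c']
```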
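match shortest string (lazy quantifier)
(.*?) a ? placed after a quantifier makes it lazy: it matches as few characters as possible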
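Greedy and lazy matching are easiest to compare side by side. A minimal PHP sketch on a made-up HTML fragment:

```php
<?php
// Illustrative input with two <b> elements.
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* runs to the last </b> it can reach.
preg_match('/<b>(.*)<\/b>/', $html, $greedy);

// Lazy: .*? stops at the first </b>.
preg_match('/<b>(.*?)<\/b>/', $html, $lazy);

var_dump($greedy[1]); // 'one</b> and <b>two'
var_dump($lazy[1]);   // 'one'
```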
flags
g global: does not stop after the first match; each subsequent search restarts from the end of the previous match (JavaScript-style flag; PHP has no g flag, use preg_match_all instead)
m multi-line: when enabled, ^ and $ match the start and end of each line, instead of only the whole string
i case-insensitive
/roar/i
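A short PHP sketch of the flags. PHP's preg functions take i and m as pattern modifiers; since there is no g modifier in PHP, preg_match_all() is used here to get the "global" behavior. The input strings are illustrative.

```php
<?php
$caseSensitive   = preg_match('/roar/', 'ROAR');  // 0
$caseInsensitive = preg_match('/roar/i', 'ROAR'); // 1

$log = "id 1\nid 2";
$lastOnly = preg_match_all('/\d+$/', $log);  // 1: $ matches only at the end of the string
$perLine  = preg_match_all('/\d+$/m', $log); // 2: $ matches at the end of each line

// Global search: count every occurrence instead of stopping at the first.
$count = preg_match_all('/o/', 'roar boom', $all); // 3

var_dump($caseSensitive, $caseInsensitive, $lastOnly, $perLine, $count);
```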
grouping and capturing
everything between () is captured and can be reused with $1, $2, ... in the replacement
to avoid capturing use (?:notCaptured)
preg_replace('/(078)(652)(27)(36)/', '$4 $3 $2 $1', '0786522736'); // returns '36 27 652 078'
look-ahead and look-behind
d(?=r) matches a d only if it is followed by r, but r will not be part of the overall regex match
(?<=r)d matches a d only if it is preceded by an r, but r will not be part of the overall regex match
you can also negate both forms
d(?!r) matches a d only if it is not followed by r, but r will not be part of the overall regex match
(?<!r)d matches a d only if it is not preceded by an r, but r will not be part of the overall regex match
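One way to see that the r is checked but not consumed is to replace only the matched d. A small PHP sketch with made-up inputs:

```php
<?php
// In each case only the d itself is replaced; the r is left untouched.
$ahead     = preg_replace('/d(?=r)/', 'D', 'dr dx');  // d followed by r
$behind    = preg_replace('/(?<=r)d/', 'D', 'rd xd'); // d preceded by r
$notAhead  = preg_replace('/d(?!r)/', 'D', 'dr dx');  // d not followed by r
$notBehind = preg_replace('/(?<!r)d/', 'D', 'rd xd'); // d not preceded by r

var_dump($ahead, $behind, $notAhead, $notBehind);
// 'Dr dx', 'rD xd', 'dr Dx', 'rd xD'
```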
negation
The ^ (circumflex or caret) inside square brackets negates the expression
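So to find a "foo" preceded by any single character other than "." (the match includes that character, and a foo at the very start of the string is not matched):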
[^.]foo
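A negated class still has to consume one character, which matters at the start of a string. The PHP sketch below compares it with a negative look-behind, which is one possible alternative when the preceding character should not be part of the match. Inputs are illustrative.

```php
<?php
// The negated class consumes the character before foo.
preg_match('/[^.]foo/', 'a foo', $classMatch); // $classMatch[0] is ' foo'

// With nothing before foo, the class has no character to match.
$classAtStart = preg_match('/[^.]foo/', 'foo'); // 0

// A negative look-behind only checks, it consumes nothing.
preg_match('/(?<!\.)foo/', 'foo', $behindMatch); // $behindMatch[0] is 'foo'
$behindAfterDot = preg_match('/(?<!\.)foo/', '.foo'); // 0

var_dump($classMatch[0], $classAtStart, $behindMatch[0], $behindAfterDot);
```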
negate whole expression
(?!pattern)
take only lines not starting with url
^(?!url).*$
lines not containing drivers
^((?!drivers).)*$
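Both line filters can be tried with preg_grep(), which keeps the array elements that match a pattern. This is only a sketch on a made-up list of lines; the last call shows PREG_GREP_INVERT as a simpler alternative when the code, rather than the pattern, can do the negation.

```php
<?php
// Illustrative lines.
$lines = ['url: /home', 'title: Home', 'loading drivers', 'done'];

// Keep lines that do not start with url.
$notUrl = array_values(preg_grep('/^(?!url).*$/', $lines));

// Keep lines that do not contain drivers anywhere.
$noDrivers = array_values(preg_grep('/^((?!drivers).)*$/', $lines));

// Same result as above: match drivers, then invert the selection.
$inverted = array_values(preg_grep('/drivers/', $lines, PREG_GREP_INVERT));

var_dump($notUrl);    // ['title: Home', 'loading drivers', 'done']
var_dump($noDrivers); // ['url: /home', 'title: Home', 'done']
```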
unicode
https://www.regular-expressions.info/unicode.html
The PHP preg functions, which are based on PCRE, support Unicode when the /u option is appended to the regular expression.
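A minimal PHP sketch of what the u modifier changes, assuming UTF-8 input. The strings are built with \u{...} escapes so the exact code points are unambiguous: the first is "café" with a precomposed é, the second is e followed by a combining acute accent.

```php
<?php
$word = "caf\u{00E9}"; // 4 characters, 5 bytes in UTF-8

// Without u the dot counts bytes, so 4 dots do not cover the 5-byte string.
$bytes = preg_match('/^.{4}$/', $word);  // 0

// With u the dot counts code points.
$chars = preg_match('/^.{4}$/u', $word); // 1

// \p{L} matches letters from any language, including the accented one.
$letters = preg_match('/^\p{L}+$/u', $word); // 1

// \p{M} matches the combining mark that follows the base letter.
$combined = preg_match('/^\p{L}\p{M}$/u', "e\u{0301}"); // 1

var_dump($bytes, $chars, $letters, $combined);
```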
reference
https://medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285