Regular Expressions Syntax Basics
Overview
Regular expressions in Abyss Web Server conform to the PCRE (Perl Compatible Regular Expressions) syntax. This article is a quick guide to the basics of regular expressions. For an extensive description of their syntax, refer to the PCREPATTERN section in http://pcre.org/pcre.txt.
Syntax basics
When matching a string (a sequence of characters) with a regular expression, the following rules apply:
- . matches any single character (by default in PCRE, any character except a newline),
- * repeats the previous match zero or more times,
- + repeats the previous match one or more times,
- ? makes the previous match optional (it occurs zero times or once),
- {n,m} repeats the previous match n times at least and m times at most (n and m are non-negative integers, with n not greater than m),
- {n} repeats the previous match exactly n times,
- {n,} repeats the previous match n times at least,
- {,m} repeats the previous match m times at most (classic PCRE releases treat {,m} as literal text rather than as a quantifier, so {0,m} is the safer spelling),
- ^ is an anchor that matches the beginning of a string,
- $ is an anchor that matches the end of a string,
- [set] matches any character in the specified set,
- [^set] matches any character not in the specified set,
- \ suppresses the syntactic significance of a special character,
- (expression) groups the characters between the parentheses into a single unit and captures the text it matches for later use as a backreference ($1, ..., $9).
A set is made of characters or ranges. A range is formed by two characters with a - in the middle (as in 0-9 or a-z).
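The rules above can be tried out with a short script. This is only a sketch: it uses Python's re module as a stand-in for PCRE, on the assumption that the basic constructs listed here behave the same way in both, and the helper name matches is made up for the illustration.

```python
import re

# Python's re module stands in for PCRE here (an assumption of this sketch);
# the quantifiers, anchors and sets used below are common to both.
def matches(pattern, text):
    """Return True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

print(matches(r"ab*c", "ac"))             # True: * accepts zero b characters
print(matches(r"ab+c", "ac"))             # False: + needs at least one b
print(matches(r"^a{2,3}$", "aaaa"))       # False: four a's exceed {2,3}
print(matches(r"^[0-9a-f]+$", "c0ffee"))  # True: every character is in the set
print(matches(r"^[^0-9]+$", "abc1"))      # False: 1 belongs to the excluded range
```

Without the ^ and $ anchors, the third pattern would succeed, because a run of two or three a's can be found inside aaaa.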
Preceding a special character with \ makes it lose its syntactic significance, so that it matches that character literally. Outside a set, the special characters are:
()[]{}.*+?^$\
Inside a set, the special characters are:
[]\-^
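The effect of escaping is easiest to see by comparing an escaped pattern with its unescaped counterpart. The sketch below again uses Python's re module in place of PCRE (an assumption), and the file name index.php is only a sample string chosen for the illustration.

```python
import re

def matches(pattern, text):
    """Return True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

# Outside a set: an unescaped dot matches any character, an escaped one
# matches only a literal dot.
print(matches(r"index.php", "index-php"))   # True
print(matches(r"index\.php", "index-php"))  # False
print(matches(r"index\.php", "index.php"))  # True

# Inside a set: an escaped - is a literal hyphen, not a range operator.
print(matches(r"^[a\-z]+$", "a-z"))  # True: the set holds a, - and z
print(matches(r"^[a\-z]+$", "m"))    # False: m would only match the range a-z
```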
Examples of regular expressions
- abc
- Matches any string containing the substring abc.
- abcd*
- Matches any string containing the substring abc followed by zero or more d characters. Since zero d characters are allowed, this accepts the same strings as abc.
- abcd?
- Matches any string containing the substring abc, optionally followed by a single d.
- ab(cd)?
- Matches any string containing the substring ab, optionally followed by cd.
- ^/dir
- Matches any string starting with the substring /dir.
- \.exe$
- Matches any string ending with the substring .exe. Note that the dot character . has been escaped to remove its special meaning.
- ^/dir/.*\.exe$
- Matches any string beginning with /dir/ and ending with .exe.
- ^/dir/[^./]+\.exe$
- Matches any string made of /dir/, followed by one or more characters other than . and /, followed by .exe.
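To check the expressions listed above against concrete strings, a small script such as the following can help. It is a sketch that assumes Python's re module behaves like PCRE for these patterns; the sample paths and the helper name matching_patterns are invented for the illustration.

```python
import re

# The example expressions from the list above.
patterns = [
    r"abc",
    r"abcd*",
    r"abcd?",
    r"ab(cd)?",
    r"^/dir",
    r"\.exe$",
    r"^/dir/.*\.exe$",
    r"^/dir/[^./]+\.exe$",
]

def matching_patterns(text):
    """Return the expressions from the list that match somewhere in text."""
    return [p for p in patterns if re.search(p, text)]

# Hypothetical sample paths.
for path in ["/dir/test.exe", "/dir/sub/test.exe", "/other/abcd.txt"]:
    print(path, "->", matching_patterns(path))
```

The second sample path is matched by ^/dir/.*\.exe$ but not by ^/dir/[^./]+\.exe$, because the set [^./] refuses the / that separates sub from test.exe.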
Examples of backreferences
- /dir/test.exe matches ^/dir/([^./]+)\.exe$. This regular expression has a single group, ([^./]+), which defines the backreference $1. In this case, the value of $1 is test.
- /dir/test.exec matches /dir/([^./]+)\.(exe.*)$. This regular expression has two groups: ([^./]+), which defines the backreference $1, and (exe.*), which defines the backreference $2. In this case, the value of $1 is test and the value of $2 is exec.
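The captured values can be inspected programmatically. The sketch below uses Python's re module as a stand-in for PCRE (an assumption). Python exposes captures through groups() and writes a backreference as \1 in a replacement string, where this article writes $1. The target path /cgi-bin/ is hypothetical.

```python
import re

def captures(pattern, text):
    """Return the captured groups as a tuple, or None if there is no match."""
    m = re.search(pattern, text)
    return m.groups() if m else None

print(captures(r"^/dir/([^./]+)\.exe$", "/dir/test.exe"))      # ('test',)
print(captures(r"/dir/([^./]+)\.(exe.*)$", "/dir/test.exec"))  # ('test', 'exec')

# Reusing the first capture in a replacement string (\1 plays the role of $1).
rewritten = re.sub(r"^/dir/([^./]+)\.exe$", r"/cgi-bin/\1", "/dir/test.exe")
print(rewritten)  # /cgi-bin/test
```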
See also
- URL Rewriting Tutorial (Article)
- Converting mod_rewrite directives (Article)
- URL Rewriting (Forum)