Syntax of Regular Expressions

SLXProfiler

Regular Expressions are used to specify patterns of text for searches.

Simple Matches

Single characters match themselves unless they are metacharacters with special meaning. Characters that normally function as metacharacters or escape sequences can be interpreted literally by preceding them with a backslash "\".

Examples

userid	matches string 'userid'
\^UserID	matches '^UserID'
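
The two examples above can be tried out with a short sketch. It assumes Python's re module, whose syntax overlaps with the syntax described on this page but is not identical to it; the sample strings are made up for illustration.

```python
import re

# A plain pattern matches itself anywhere in the text.
plain = re.search(r"userid", "the userid field")

# A backslash makes the metacharacter '^' literal.
escaped = re.search(r"\^UserID", "key=^UserID")

# Without the backslash, '^' anchors the match to the start of the text,
# so the same input no longer matches.
anchored = re.search(r"^UserID", "key=^UserID")

# re.escape() is a Python-specific helper that adds the backslashes for you.
auto_escaped = re.escape("^UserID")

print(plain.group(0), escaped.group(0), anchored, auto_escaped)
```

Escaping matters most when the search text comes from user input, where a stray metacharacter would otherwise change the meaning of the pattern.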

Escape Sequences

Characters can be specified using an escape sequence syntax similar to that used in C and Perl.

Supported escape sequences

\xnn	char with hex code nn
\x{nnnn}	char with hex code nnnn (one byte for plain text and two bytes for Unicode)
\t	tab (HT/TAB), same as \x09
\n	newline (NL), same as \x0a
\r	carriage return (CR), same as \x0d
\f	form feed (FF), same as \x0c
\a	alarm (bell) (BEL), same as \x07
\e	escape (ESC), same as \x1b

Examples

user\x20id	matches 'user id' (note the space in the middle)
\tuserid	matches 'userid' preceded by a tab
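
As a rough illustration, the escape sequences can be exercised with Python's re module. This is an assumption made for the sketch, not the engine behind this page: Python accepts \xnn, \t, \n, \r, \f and \a inside patterns but not the \x{nnnn} or \e forms, so those are left out.

```python
import re

# \x20 is the space character, so this pattern contains a literal space.
space_match = re.search(r"user\x20id", "the user id column")

# \t matches a tab character in front of 'userid'.
tab_match = re.search(r"\tuserid", "name\tuserid")

# \n and \x0a name the same character.
same_char = re.fullmatch(r"\x0a", "\n") is not None

print(repr(space_match.group(0)), repr(tab_match.group(0)), same_char)
```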

Character Classes

A character class is a list of characters enclosed in [ ]. It matches any one character from the list.

If the first character after the "[" is "^", the class matches any character not in the list.

Examples

user[aeiou]d	finds strings 'userad', 'usered' etc. but not 'userbd', 'usercd' etc.
user[^aeiou]d	finds strings 'userbd', 'usercd' etc. but not 'userad', 'usered' etc.
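
A minimal sketch of the two classes, assuming Python's re module (which handles these simple classes the same way); the word list is invented for the example.

```python
import re

words = ["userad", "usered", "userbd", "usercd"]

# [aeiou] accepts exactly one vowel in the fifth position.
vowel = [w for w in words if re.fullmatch(r"user[aeiou]d", w)]

# [^aeiou] accepts exactly one character that is not a vowel.
non_vowel = [w for w in words if re.fullmatch(r"user[^aeiou]d", w)]

print(vowel)      # ['userad', 'usered']
print(non_vowel)  # ['userbd', 'usercd']
```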

The "-" character is used to specify a range within a list. If you want "-" itself to be a member of a class, put it at the start or end of the list or escape it with a backslash.

Examples

[-az]	matches 'a', 'z' and '-'
[az-]	matches 'a', 'z' and '-'
[a\-z]	matches 'a', 'z' and '-'
[a-z]	matches all twenty-six lowercase letters from 'a' to 'z'
[\n-\x0D]	matches any of the ASCII characters 10, 11, 12 or 13
[\d-t]	matches any digit, '-' or 't'
[]-a]	matches any character from ']' to 'a'.
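
The placement rules for "-" can be checked with a small sketch. It assumes Python's re module and covers only the forms that Python is known to accept in the same way; the helper name members and the candidate lists are hypothetical.

```python
import re

def members(pattern, candidates):
    """Return the candidates that the class pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

candidates = ["a", "m", "z", "-"]

hyphen_first = members(r"[-az]", candidates)     # '-' at the start is literal
hyphen_last = members(r"[az-]", candidates)      # '-' at the end is literal
hyphen_escaped = members(r"[a\-z]", candidates)  # escaped '-' is literal
letter_range = members(r"[a-z]", candidates)     # '-' between characters is a range

# A range can also be written with escape sequences: codes 10 through 13.
control_range = members(r"[\n-\x0D]", ["\t", "\n", "\x0b", "\x0c", "\r", " "])

print(hyphen_first, hyphen_last, hyphen_escaped, letter_range)
```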

Metacharacters

Metacharacters are the special characters that give regular expressions their power. The different types of metacharacters are described below, after the modifiers that control how an expression is applied.

i	Enables case-insensitive pattern matching.
m	Treats the string as multiple lines, so that ^ and $ match at the start and end of each line.
s	Treats the string as a single line, so that . matches any character, including line separators.
g	Non-standard modifier that controls greedy mode (On by default). Switching it Off puts all following operators into non-greedy mode, so '+' works as '+?', '*' as '*?' etc.
x	Ignores whitespace in the expression that is neither backslashed nor within a character class. You can use this to break a regular expression into more readable parts.
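
For comparison, here is a sketch of the i, m, s and x modifiers using the corresponding flags of Python's re module. The mapping to Python flags is an assumption made for illustration, and Python has no counterpart to the non-standard g modifier, so it is not shown.

```python
import re

text = "UserID\nuserid"

# i: letter case is ignored.
case_insensitive = re.findall(r"userid", text, re.IGNORECASE)

# m: '^' and '$' also match at the start and end of each line.
multi_line = re.findall(r"^userid$", text, re.MULTILINE)

# s: '.' also matches the line break between the two lines.
without_s = re.search(r"UserID.userid", text)
with_s = re.search(r"UserID.userid", text, re.DOTALL)

# x: unescaped whitespace in the pattern is ignored, so it can be spaced out.
verbose = re.search(r"user \s* id", "user   id", re.VERBOSE)

print(case_insensitive, multi_line, without_s, with_s is not None)
```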

Metacharacters - line separators

^	start of line
$	end of line
\A	start of text
\Z	end of text
.	any character in line

^userid	matches string 'userid' only if it's at the beginning of line
userid$	matches string 'userid' only if it's at the end of line
^userid$	matches string 'userid' only if it's the only string in line
user.d	matches strings like 'userid', 'usercd', 'user6d' and so on
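
The anchors can be compared side by side with a short sketch, assuming Python's re module and single-line inputs (multi-line behaviour depends on the engine and its modifiers); the sample strings are invented.

```python
import re

lines = ["userid", "userid2", "my userid"]

at_start = [s for s in lines if re.search(r"^userid", s)]
at_end = [s for s in lines if re.search(r"userid$", s)]
whole_line = [s for s in lines if re.search(r"^userid$", s)]

# '.' stands for exactly one character, so 'userd' is too short to match.
dot = [s for s in ["userid", "user6d", "userd"] if re.search(r"user.d", s)]

print(at_start)    # ['userid', 'userid2']
print(at_end)      # ['userid', 'my userid']
print(whole_line)  # ['userid']
print(dot)         # ['userid', 'user6d']
```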

Metacharacters - predefined classes

\w	an alphanumeric character (including "_")
\W	a non-alphanumeric character
\d	a numeric character
\D	a non-numeric character
\s	any space (same as [ \t\n\r\f])
\S	a non-space character

user\dd	matches strings like 'user1d', 'user6d' etc. but not 'userad', 'userbd' etc.
user[\w\s]d	matches strings like 'userid', 'user d', 'user6d' etc. but not 'user=d', 'user-d' etc.
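
A small sketch of the predefined classes, assuming Python's re module applied to ASCII sample strings (engines can differ on which non-ASCII characters count as word or digit characters).

```python
import re

# \d accepts one digit between 'user' and the final 'd'.
digit = [s for s in ["user1d", "user6d", "userad"]
         if re.fullmatch(r"user\dd", s)]

# [\w\s] accepts one word character (letter, digit or '_') or one space.
word_or_space = [s for s in ["userid", "user d", "user_d", "user=d"]
                 if re.fullmatch(r"user[\w\s]d", s)]

print(digit)          # ['user1d', 'user6d']
print(word_or_space)  # ['userid', 'user d', 'user_d']
```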

Metacharacters - iterators

*	zero or more (greedy), similar to {0,}
+	one or more (greedy), similar to {1,}
?	zero or one (greedy), similar to {0,1}
{n}	exactly n times (greedy)
{n,}	at least n times (greedy)
{n,m}	at least n but not more than m times (greedy)
*?	zero or more (non-greedy), similar to {0,}?
+?	one or more (non-greedy), similar to {1,}?
??	zero or one (non-greedy), similar to {0,1}?
{n}?	exactly n times (non-greedy)
{n,}?	at least n times (non-greedy)
{n,m}?	at least n but not more than m times (non-greedy)
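
The difference between the greedy and non-greedy forms is easiest to see on a concrete input. The sketch below assumes Python's re module, which uses the same trailing "?" convention; the sample text is made up.

```python
import re

text = "<b>one</b><b>two</b>"

# Greedy '+' takes as much as possible, running to the last '</b>'.
greedy = re.search(r"<b>.+</b>", text).group(0)

# Non-greedy '+?' stops at the first '</b>' that completes a match.
non_greedy = re.search(r"<b>.+?</b>", text).group(0)

# The same applies to counted iterators.
counted_greedy = re.search(r"\d{2,4}", "12345").group(0)
counted_non_greedy = re.search(r"\d{2,4}?", "12345").group(0)

print(greedy)              # <b>one</b><b>two</b>
print(non_greedy)          # <b>one</b>
print(counted_greedy)      # 1234
print(counted_non_greedy)  # 12
```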

Metacharacters - subexpressions

(userid){8,10}	matches strings which contain 8, 9 or 10 consecutive instances of 'userid'
user([0-9]|i+)d	matches 'user0d', 'user2d', 'userid', 'useriid', 'useriiid' etc.
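
Grouping and alternation can be sketched as follows, assuming Python's re module. The second pattern, user([0-9]|i+)d, accepts either a single digit or one or more 'i' characters between 'user' and 'd'; the sample strings are invented.

```python
import re

# An iterator placed after a group applies to the whole group.
nine_copies = re.fullmatch(r"(userid){8,10}", "userid" * 9) is not None
seven_copies = re.fullmatch(r"(userid){8,10}", "userid" * 7) is not None

# '|' separates alternatives inside the group.
samples = ["user0d", "userid", "useriiid", "userad", "user12d"]
alternatives = [s for s in samples if re.fullmatch(r"user([0-9]|i+)d", s)]

print(nine_copies, seven_copies)  # True False
print(alternatives)               # ['user0d', 'userid', 'useriiid']
```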

Metacharacters - backreferences

(.)\1+	matches 'aaaa' and 'cc'
(.+)\1+	also matches 'abab' and '123123'
(['"]?)(\d+)\1	matches "13" (in double quotes), '4' (in single quotes), 77 (without quotes) etc.
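
A backreference repeats whatever text the numbered group actually captured, which the following sketch tries to show. It assumes Python's re module, where \1 has the same meaning; the inputs are made up.

```python
import re

# \1 must repeat the exact character captured by the first group.
repeated_char = [s for s in ["aaaa", "cc", "ab"]
                 if re.fullmatch(r"(.)\1+", s)]

# With (.+) the captured text can be longer than one character.
repeated_block = [s for s in ["abab", "123123", "abcd"]
                  if re.fullmatch(r"(.+)\1+", s)]

# The closing quote must be the same as the opening quote (or both absent).
quoted = re.compile(r"""(['"]?)(\d+)\1""")
numbers = [quoted.fullmatch(s).group(2) for s in ['"13"', "'4'", "77"]]
mismatched = quoted.fullmatch("\"13'")

print(repeated_char)   # ['aaaa', 'cc']
print(repeated_block)  # ['abab', '123123']
print(numbers)         # ['13', '4', '77']
print(mismatched)      # None
```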