Syntax of Regular Expressions

SLXProfiler

^

start of line

$

end of line

\A

start of text

\Z

end of text

.

any character in line

Examples

^userid

matches the string 'userid' only if it is at the beginning of a line

userid$

matches the string 'userid' only if it is at the end of a line

^userid$

matches the string 'userid' only if it is the entire line

user.d

matches strings like 'userid', 'usercd', 'user6d' and so on
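The anchors above can be tried out with a short sketch. It assumes Python's `re` module, which is not the engine this page documents, so details may differ. In Python, ^ and $ match at line boundaries only when the `re.MULTILINE` flag is set.

```python
import re

text = "userid\nadmin userid\nuserid admin"

# re.MULTILINE lets ^ and $ match at every line boundary in the text.
line_starts = [m.start() for m in re.finditer(r"^userid", text, re.MULTILINE)]
line_ends = [m.start() for m in re.finditer(r"userid$", text, re.MULTILINE)]
whole_lines = re.findall(r"^userid$", text, re.MULTILINE)

# \A and \Z ignore line boundaries and anchor to the whole text.
text_start = [m.start() for m in re.finditer(r"\Auserid", text, re.MULTILINE)]
text_end = re.search(r"admin\Z", text) is not None

# "." matches any character except a newline (Python's default behavior).
dot_hits = re.findall(r"user.d", "userid usercd user6d user\nd")

print(line_starts, line_ends, whole_lines, text_start, text_end, dot_hits)
```

Here 'userid' occurs three times in the text, but only two occurrences start a line, two end a line, and one fills a whole line.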

 

\w

an alphanumeric character (including "_")

\W

a non-alphanumeric character

\d

a numeric character

\D

a non-numeric character

\s

any space (same as [ \t\n\r\f])

\S

a non-space character

Examples

user\dd

matches strings like 'user1d', 'user6d' etc. but not 'userad', 'userbd' etc.

user[\w\s]d

matches strings like 'userid', 'user d', 'usercd', 'user6d' etc. but not 'user=d', 'user-d' etc. (digits are matched because \w includes them)
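As a rough check of the predefined classes, the sketch below uses Python's `re` module (an assumption; it is not the engine this page documents). The helper `full` is a hypothetical name introduced only for this example. Note that for Python text strings \w and \d also match non-ASCII letters and digits unless `re.ASCII` is passed.

```python
import re

def full(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# \d accepts only a digit in the fifth position.
digit_hits = [s for s in ["user1d", "user6d", "userad", "userbd"]
              if full(r"user\dd", s)]

# [\w\s] accepts a letter, a digit, "_" or any whitespace character.
class_hits = [s for s in ["userid", "user d", "usercd", "user6d", "user=d"]
              if full(r"user[\w\s]d", s)]

print(digit_hits)
print(class_hits)
```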

 

 

Any item in a regular expression may be followed by iterators. Using these metacharacters you can specify the number of occurrences of a previous character, metacharacter or subexpression.

 

*

zero or more, similar to {0,}

+

one or more, similar to {1,}

?

zero or one, similar to {0,1}

{n}

exactly n times

{n,}

at least n times

{n,m}

at least n but not more than m times

*?

zero or more (non-greedy), similar to {0,}?

+?

one or more (non-greedy), similar to {1,}?

??

zero or one (non-greedy), similar to {0,1}?

{n}?

exactly n times (non-greedy)

{n,}?

at least n times (non-greedy)

{n,m}?

at least n but not more than m times (non-greedy)

Examples

user.*d

matches strings like 'userid', 'useralkjdflkj9d' and 'userd'

user.+d

matches strings like 'userid', 'useralkjdflkj9d' but not 'userd'

user.?d

matches strings like 'userid', 'usercd' and 'userd' but not 'useralkj9d'

useri{2}d

matches the string 'useriid'

useri{2,}d

matches strings like 'useriid', 'useriiid', 'useriiiid' etc.

useri{2,3}d

matches 'useriid' or 'useriiid' but not 'useriiiid'
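The difference between greedy and non-greedy iterators is easiest to see side by side. The following is a minimal sketch using Python's `re` module (an assumption; the engine documented here may differ in detail): the greedy form takes as much text as possible, the non-greedy form as little as possible.

```python
import re

text = "userid and userd"

# Greedy: .* runs to the last possible 'd'.
greedy = re.search(r"user.*d", text).group()
# Non-greedy: .*? stops at the first possible 'd'.
lazy = re.search(r"user.*?d", text).group()

# Counted repetition: exactly two or three 'i' characters are accepted.
candidates = ["userid", "useriid", "useriiid", "useriiiid"]
counted = [re.fullmatch(r"useri{2,3}d", s) is not None for s in candidates]

# '+' needs at least one character between 'user' and 'd'; '?' does not.
plus_on_userd = re.fullmatch(r"user.+d", "userd") is not None
optional_on_userd = re.fullmatch(r"user.?d", "userd") is not None

print(greedy, "|", lazy, "|", counted, plus_on_userd, optional_on_userd)
```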

 

 

You can specify a series of alternatives for a pattern using "|" to separate them, so that fee|fie|foe will match any of "fee", "fie", or "foe" in the target string (as would f(e|i|o)e). It is common practice to include alternatives in parentheses to minimize confusion about where they start and end.

Example

user(id|user)

matches strings 'userid' or 'useruser'.
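The sketch below illustrates why the parentheses matter. It assumes Python's `re` module rather than the engine documented here. Without parentheses, the alternation splits the whole pattern, so userid|user means "userid" or "user".

```python
import re

candidates = ["userid", "useruser", "user"]

# Grouped: 'user' followed by either 'id' or 'user'.
grouped = [s for s in candidates if re.fullmatch(r"user(id|user)", s)]

# Ungrouped: either the whole word 'userid' or the whole word 'user'.
ungrouped = [s for s in candidates if re.fullmatch(r"userid|user", s)]

# The f(e|i|o)e form from the text finds the same words as fee|fie|foe.
words = [m.group() for m in re.finditer(r"f(e|i|o)e", "fee fie foe fum")]

print(grouped, ungrouped, words)
```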

 

Subexpressions are numbered based on the left to right order of their opening parenthesis.

Examples

(userid){8,10}

matches strings that contain 8, 9 or 10 consecutive instances of 'userid'

user([0-9]|i+)d

matches 'user0d', 'user2d', 'userid', 'useriid', 'useriiid' etc.
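The numbering rule can be illustrated with nested parentheses. This sketch assumes Python's `re` module, and the pattern and sample string are made up for the example. The outer group opens first, so it is number 1 even though it closes last.

```python
import re

# Opening parentheses, left to right: 1 = whole login, 2 = 'user',
# 3 = digits, 4 = host name.
m = re.fullmatch(r"((user)(\d+))@(\w+)", "user42@host")

groups = [m.group(n) for n in range(1, 5)]
print(groups)
```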

 

Metacharacters \1 through \9 are interpreted as backreferences. \<n> matches a previously matched subexpression #<n>.

Examples

(.)\1+

matches 'aaaa' and 'cc'

(.+)\1+

also matches 'abab' and '123123'

(['"]?)(\d+)\1

matches "13" (in double quotes), '4' (in single quotes), or 77 (without quotes) etc.
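A backreference repeats the text that a subexpression actually matched, not the subexpression's pattern. The sketch below assumes Python's `re` module, and `number_in_quotes` is a hypothetical helper name. It uses the quoted-number pattern from the examples and shows that mismatched quotes are rejected.

```python
import re

# Group 1 captures the opening quote (possibly empty); \1 demands the same
# text again after the digits.
quoted = re.compile(r"""(['"]?)(\d+)\1""")

def number_in_quotes(text):
    """Return the digits if text is a consistently quoted number, else None."""
    m = quoted.fullmatch(text)
    return m.group(2) if m else None

results = [number_in_quotes(s) for s in ['"13"', "'4'", "77", "\"13'"]]

# (.)\1+ finds a run of one repeated character.
run = re.search(r"(.)\1+", "abccd").group()

# (.+)\1+ finds a repeated block; group 1 holds the block itself.
block = re.fullmatch(r"(.+)\1+", "123123").group(1)

print(results, run, block)
```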

 

Regular Expressions are used to specify patterns of text for searches.

Simple Matches

Single characters match themselves unless they are metacharacters with special meaning. Characters that normally function as metacharacters or escape sequences can be interpreted literally by preceding them with a backslash "\".

Examples

userid

matches string 'userid'

\^UserID

matches '^UserID'
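The effect of the backslash can be checked as follows. This is a sketch using Python's `re` module, which is an assumption and not the engine documented here. `re.escape` is a Python convenience that adds the backslashes for you.

```python
import re

text = "id: ^UserID"

# Escaped: ^ is an ordinary character and is found in the middle of the text.
literal = re.search(r"\^UserID", text)

# Unescaped: ^ is the start-of-line anchor, so nothing is found here.
anchored = re.search(r"^UserID", text)

# re.escape builds the escaped pattern from a plain string.
escaped_pattern = re.escape("^UserID")
via_escape = re.search(escaped_pattern, text)

print(literal.group(), anchored, escaped_pattern, via_escape.group())
```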

Escape Sequences

Characters can be specified using an escape sequence syntax similar to that used in C and Perl.

Supported escape sequences

\xnn

char with hex code nn

\x{nnnn}

char with hex code nnnn (one byte for plain text and two bytes for Unicode)

\t

tab (HT/TAB), same as \x09

\n

newline (NL), same as \x0a

\r

carriage return (CR), same as \x0d

\f

form feed (FF), same as \x0c

\a

alarm (bell) (BEL), same as \x07

\e

escape (ESC), same as \x1b

Examples

user\x20id

matches 'user id' (note the space in the middle)

\tuserid

matches 'userid' preceded by a tab
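Most of these escape sequences can be tried in Python's `re` module, used here only as a stand-in for the engine documented on this page. Python's `re` does not accept \e or \x{nnnn}, so the sketch writes the escape character as \x1b and leaves the braces form out.

```python
import re

# \x20 is the space character.
space_match = re.fullmatch(r"user\x20id", "user id") is not None

# \t in the pattern matches a literal tab in the text.
tab_match = re.fullmatch(r"\tuserid", "\tuserid") is not None

# Each named escape equals its hex form.
pairs = [(r"\t", r"\x09"), (r"\n", r"\x0a"), (r"\r", r"\x0d"),
         (r"\f", r"\x0c"), (r"\a", r"\x07")]
same_char = [re.fullmatch(hex_form, re.compile(named).pattern.encode()
                          .decode("unicode_escape")) is not None
             for named, hex_form in pairs]

print(space_match, tab_match, same_char)
```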

Character Classes

You can specify a character class by enclosing a list of characters in "[]". The class matches any one character from the list.

If the first character after the "[" is "^", the class matches any character not in the list.

Examples

user[aeiou]d

matches strings 'userad', 'usered' etc. but not 'userbd', 'usercd' etc.

user[^aeiou]d

matches strings 'userbd', 'usercd' etc. but not 'userad', 'usered' etc.

 

 

The "-" character is used to specify a range within a list. If you want "-" itself to be a member of a class, put it at the start or end of the list or escape it with a backslash.

Examples

[-az]

matches 'a', 'z' and '-'

[az-]

matches 'a', 'z' and '-'

[a\-z]

matches 'a', 'z' and '-'

[a-z]

matches any of the twenty-six lowercase letters from 'a' to 'z'

[\n-\x0D]

matches any of the ASCII characters 10,11,12, or 13

[\d-t]

matches any digit, '-' or 't'

[]-a]

matches any character from ']' to 'a'.
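The placement rules for "-" and "]" can be verified with a small sketch. It assumes Python's `re` module, and `chars_matching` is a hypothetical helper that tests each candidate character against a class. The [\d-t] form is left out because Python rejects it as an invalid range, so that example is specific to the engine documented here.

```python
import re

def chars_matching(char_class, candidates):
    """Return the candidate characters that the class accepts."""
    return [c for c in candidates if re.fullmatch(char_class, c)]

# "-" at the start, at the end, or escaped is a literal hyphen.
hyphen_first = chars_matching(r"[-az]", "a-zm")
hyphen_last = chars_matching(r"[az-]", "a-zm")
hyphen_escaped = chars_matching(r"[a\-z]", "a-zm")

# Between two characters, "-" denotes a range.
letter_range = chars_matching(r"[a-z]", "a-zm")
control_range = chars_matching(r"[\n-\x0D]", "\t\n\x0b\x0c\r")

# "]" placed first is a literal, so this is the range from ']' to 'a'.
bracket_range = chars_matching(r"[]-a]", "]^_`ab")

# A leading "^" negates the class.
negated = chars_matching(r"[^aeiou]", "abe")

print(hyphen_first, letter_range, bracket_range, negated)
```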

Metacharacters

Metacharacters are special characters which are the essence of Regular Expressions. The different types of metacharacters are described below.

 

Metacharacters - line separators
Metacharacters - predefined classes
Metacharacters - iterators
Metacharacters - alternatives
Metacharacters - subexpressions
Metacharacters - backreferences

Modifiers

Modifiers are used to change the behavior of regular expressions.

 

i

Used for case-insensitive pattern matching.

m

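Treats a string as multiple lines, so that "^" and "$" match at the start and end of every line rather than only at the start and end of the text.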

s

Treats a string as a single line, so that "." matches any character, including line separators.

g

A non-standard modifier that is On by default. Switching it Off puts all following operators into non-greedy mode, so that '+' works as '+?', '*' as '*?' etc.

x

Tells the regular expression to ignore whitespace that is neither backslashed nor within a character class. You can use this to break a regular expression into more readable parts.
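For comparison, Python's `re` module offers flags that correspond to the i, m, s and x modifiers. The sketch below assumes that module and is not a description of how modifiers are set in this product. The non-standard g modifier has no Python counterpart; there, non-greedy behavior is requested per operator with '+?', '*?' and so on.

```python
import re

text = "first\nuserid"

# i: case-insensitive matching.
ignore_case = re.fullmatch(r"userid", "UserID", re.IGNORECASE) is not None

# m: ^ matches at the start of the second line only in multi-line mode.
without_m = re.search(r"^userid", text) is not None
with_m = re.search(r"^userid", text, re.MULTILINE) is not None

# s: "." matches the newline only in single-line mode.
without_s = re.search(r"first.userid", text) is not None
with_s = re.search(r"first.userid", text, re.DOTALL) is not None

# x: whitespace and comments inside the pattern are ignored.
verbose = re.compile(r"""
    user    # literal prefix
    \d+     # one or more digits
    """, re.VERBOSE)
verbose_match = verbose.fullmatch("user42") is not None

print(ignore_case, without_m, with_m, without_s, with_s, verbose_match)
```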

 

Related Topics

Set Filter