Categories
The Unicode Character Database defines the possible values of the General Category property and maps each code point to its character properties. The tables below list the recognized values of the General Category property.
IsCategory
Syntax
IsCategory ::= Letters | Marks | Numbers | Punctuation | Separators | Symbols | Others
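The productions in this section can be collapsed into a single pattern for checking whether a string is a recognized category name. The following is a minimal Python sketch, not part of any schema processor. The names `IS_CATEGORY` and `is_category` are hypothetical, and the sketch assumes the Others group uses the prefix C, as its two-letter values (Cc, Cf, Co, Cn) do.

```python
import re

# Pattern transcribed from the productions in this section.
# Assumption: the "Others" group uses the prefix C (as in Cc, Cf, Co, Cn).
IS_CATEGORY = re.compile(
    r"L[ultmo]?|M[nce]?|N[dlo]?|P[cdseifo]?|Z[slp]?|S[mcko]?|C[cfon]?"
)


def is_category(name: str) -> bool:
    """Return True if name is one of the category values listed in this section."""
    return IS_CATEGORY.fullmatch(name) is not None


print(is_category("Lu"))  # True: uppercase letters
print(is_category("P"))   # True: all punctuation
print(is_category("Cs"))  # False: surrogates are not included here
```

A one-letter name selects the whole group, while a two-letter name selects a single subcategory.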
Letters
Syntax
Letters ::= 'L' [ultmo]?
The following table shows the properties for letters.
Property | Description |
---|---|
L | All letters |
Lu | Uppercase |
Ll | Lowercase |
Lt | Titlecase |
Lm | Modifier |
Lo | Other |
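To see which property a given character carries, the General Category can be looked up directly. The sketch below uses Python's standard `unicodedata` module rather than a regular expression engine. The helper name `in_category` is hypothetical, and the results reflect the Unicode data bundled with the Python interpreter.

```python
import unicodedata


def in_category(ch: str, prop: str) -> bool:
    """True if ch's General Category equals prop.

    A one-letter prop such as "L" matches every subcategory of that group.
    """
    return unicodedata.category(ch).startswith(prop)


print(unicodedata.category("A"))       # Lu
print(unicodedata.category("a"))       # Ll
print(unicodedata.category("\u01c5"))  # Lt (LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON)
print(unicodedata.category("\u02b0"))  # Lm (MODIFIER LETTER SMALL H)
print(unicodedata.category("\u05d0"))  # Lo (HEBREW LETTER ALEF)
print(in_category("A", "L"))           # True
print(in_category("A", "Ll"))          # False
```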
Marks
Syntax
Marks ::= 'M' [nce]?
The following table shows the properties for marks.
Property | Description |
---|---|
M | All marks |
Mn | Nonspacing |
Mc | Spacing combining |
Me | Enclosing |
Numbers
Syntax
Numbers ::= 'N' [dlo]?
The following table shows the properties for numbers.
Property | Description |
---|---|
N | All numbers |
Nd | Decimal digit |
Nl | Letter |
No | Other |
Punctuation
Syntax
Punctuation ::= 'P' [cdseifo]?
The following table shows the properties for punctuation.
Property | Description |
---|---|
P | All punctuation |
Pc | Connector |
Pd | Dash |
Ps | Open |
Pe | Close |
Pi | Initial quote (may behave like Ps or Pe depending on usage) |
Pf | Final quote (may behave like Ps or Pe depending on usage) |
Po | Other |
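The difference between the bracket-like properties (Ps, Pe) and the quote properties (Pi, Pf) is easiest to see on concrete characters. The following Python sketch looks up a few sample characters with the standard `unicodedata` module. The sample set and the name `categories` are illustrative choices only.

```python
import unicodedata

# Sample characters: ( ) left/right guillemets, left/right double quotes, _ - !
samples = "()\u00ab\u00bb\u201c\u201d_-!"

# Map each sample character to its two-letter General Category.
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in categories.items():
    print(f"U+{ord(ch):04X} {cat}")
```

Parentheses report Ps and Pe, while the guillemets and curly double quotes report Pi and Pf, so a pattern that targets only Ps and Pe does not match quotation marks.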
Separators
Syntax
Separators ::= 'Z' [slp]?
The following table shows the properties for separators.
Property | Description |
---|---|
Z | All separators |
Zs | Space |
Zl | Line |
Zp | Paragraph |
Symbols
Syntax
Symbols ::= 'S' [mcko]?
The following table shows the properties for symbols.
Property | Description |
---|---|
S | All symbols |
Sm | Math |
Sc | Currency |
Sk | Modifier |
So | Other |
Others
Syntax
Others ::= 'C' [cfon]?
The following table shows the properties for others.
Property | Description |
---|---|
C | All others |
Cc | Control |
Cf | Format |
Co | Private use |
Cn | Not assigned |
Note: The Cs property is not included here. It identifies surrogate characters, which do not occur at the level of character abstraction that XML instance documents use.
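The following Python sketch illustrates the four listed properties and the excluded Cs property. It relies on the standard `unicodedata` module. The sample code points and the helper name `can_encode_utf8` are illustrative, and Python is used here only because its strings can hold a lone surrogate, which an XML document cannot.

```python
import unicodedata

print(unicodedata.category("\n"))      # Cc: control (LINE FEED)
print(unicodedata.category("\u200e"))  # Cf: format (LEFT-TO-RIGHT MARK)
print(unicodedata.category("\ue000"))  # Co: private use
print(unicodedata.category("\uffff"))  # Cn: not assigned (a noncharacter)

# A lone surrogate code point reports Cs.
lone_surrogate = "\ud800"
print(unicodedata.category(lone_surrogate))  # Cs


def can_encode_utf8(text: str) -> bool:
    """Return True if text can be serialized as UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Surrogates are an encoding-level device and cannot be written out as UTF-8.
print(can_encode_utf8(lone_surrogate))  # False
```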
See Also
XML Schema Regular Expressions | XML Schema Regular Expressions Reference Chart | Data Type Facets