Introduction to XPath Syntax
XPath lets you locate one or more nodes within an XML document, often by any of several alternative routes. In essence, XPath provides the syntax for performing basic queries on your XML document data. It does this by treating an XML document as a hierarchically structured data set.
All XML documents can be represented as a hierarchy, or tree, of nodes. This is similar to how paths are encoded in file system URLs, which Windows Explorer uses to produce tree views of the files and folders on your computer. The following table compares analogous features of XPath and file system URLs.
File System URLs | XPath Patterns |
---|---|
Hierarchy comprised of directories and files in a file system. | Hierarchy comprised of elements and other nodes in an XML document. |
Files at each level have unique names. | Element names at each level might not be unique. |
URLs always identify a single file. | XPath patterns identify the set of all elements that match the pattern, which can contain one node or many. |
Evaluated relative to a particular directory, called the "current directory." | Evaluated relative to a particular node called the "context" for the query. |
The following XML sample document shows a simple hierarchy, which can be used to demonstrate some of the facilities of XPath.
<?xml version='1.0'?>
<authors>
  <author period="modern">
    <name>Eva Corets</name>
    <nationality>British</nationality>
  </author>
  <author>
    <name>Cynthia Randall</name>
    <nationality>Canadian</nationality>
  </author>
  <author period="modern">
    <name>Paula Thurman</name>
    <nationality>British</nationality>
  </author>
</authors>
A basic XPath pattern describes a path through the XML hierarchy with a slash-separated list of child element names. For example, starting from the document root of the previous sample, the following pattern traverses down through the hierarchy to the <name> elements.
authors/author/name
The XPath pattern identifies all elements that match the path. In addition to describing an exact path down a known hierarchy, XPath can include wildcards for describing unknown elements and selecting a set of nodes. This provides the basics of an XML query facility. For example, an element of any name can be represented by the "*" wildcard as shown in the following pattern.
authors/*/name
The preceding sample identifies all the <name> elements that are grandchildren of the <authors> element (that is, children of any child element of <authors>), without requiring them to be children of an <author> element. Here is another example, which finds both the <name> and <nationality> elements in our sample document.
authors/author/*
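The path and wildcard patterns above can be tried outside MSXML. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module, which implements only a subset of XPath. ElementTree has no separate document node, so the sketch attaches the sample to a hypothetical wrapper element (named `document` here purely for illustration) that stands in for the document root; the `select` helper is likewise an assumption of this sketch, not part of any API.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<authors>
  <author period="modern">
    <name>Eva Corets</name>
    <nationality>British</nationality>
  </author>
  <author>
    <name>Cynthia Randall</name>
    <nationality>Canadian</nationality>
  </author>
  <author period="modern">
    <name>Paula Thurman</name>
    <nationality>British</nationality>
  </author>
</authors>"""

# Hypothetical wrapper element standing in for the document root,
# so the patterns can start with "authors/..." as they do in the text.
document = ET.Element("document")
document.append(ET.fromstring(SAMPLE))


def select(pattern):
    """Return (tag, text) pairs for every element the pattern matches."""
    return [(node.tag, node.text) for node in document.findall(pattern)]


exact_names = select("authors/author/name")   # exact path
wildcard_names = select("authors/*/name")     # any child of <authors>
all_children = select("authors/author/*")     # every child of each <author>

print(exact_names)
print(wildcard_names)
print(all_children)
```

For this sample the first two patterns select the same three <name> elements, because every child of <authors> happens to be an <author>; the third selects each <name> and <nationality> element in document order.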
Additional patterns within square brackets specify branches on the path. For example, the following query describes a branch on the <author> element, indicating that only the <author> elements with <nationality> children should be considered a pattern match.
authors/author[nationality]/name
This becomes even more useful for the sample data when comparisons are added. The following query returns the names of Canadian authors.
Note: Comparisons can be used only within brackets.
authors/author[nationality='Canadian']/name
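The two bracketed queries can be checked with a short Python sketch, again assuming xml.etree.ElementTree rather than MSXML. ElementTree supports the `[tag]` and `[tag='text']` forms used here but not arbitrary XPath comparisons, and the `document` wrapper element is a hypothetical stand-in for the document root.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<authors>
  <author period="modern">
    <name>Eva Corets</name>
    <nationality>British</nationality>
  </author>
  <author>
    <name>Cynthia Randall</name>
    <nationality>Canadian</nationality>
  </author>
  <author period="modern">
    <name>Paula Thurman</name>
    <nationality>British</nationality>
  </author>
</authors>"""

# Hypothetical wrapper element standing in for the document root.
document = ET.Element("document")
document.append(ET.fromstring(SAMPLE))

# Branch test: keep only <author> elements that have a <nationality> child.
with_nationality = [
    name.text
    for name in document.findall("authors/author[nationality]/name")
]

# Comparison inside the brackets: the <nationality> text must be 'Canadian'.
canadian = [
    name.text
    for name in document.findall("authors/author[nationality='Canadian']/name")
]

print(with_nationality)
print(canadian)
```

In the sample every author has a <nationality> child, so the branch test alone keeps all three names, while the comparison narrows the result to Cynthia Randall.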
Attributes are indicated in a query by preceding the name of the attribute with "@". The attribute can be tested as a branch off the main path, or the query can identify attribute nodes. The following examples return authors from the modern period, and just the two period attributes, respectively.
authors/author[@period="modern"]
authors/author/@period
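A Python sketch of the attribute queries follows, assuming xml.etree.ElementTree. That module supports attribute tests in brackets but cannot return attribute nodes, so the second query is only approximated: the sketch selects the <author> elements that carry a period attribute and reads the value from each. The `document` wrapper element is a hypothetical stand-in for the document root.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<authors>
  <author period="modern">
    <name>Eva Corets</name>
    <nationality>British</nationality>
  </author>
  <author>
    <name>Cynthia Randall</name>
    <nationality>Canadian</nationality>
  </author>
  <author period="modern">
    <name>Paula Thurman</name>
    <nationality>British</nationality>
  </author>
</authors>"""

# Hypothetical wrapper element standing in for the document root.
document = ET.Element("document")
document.append(ET.fromstring(SAMPLE))

# Attribute used as a branch test: authors from the modern period.
modern_authors = document.findall('authors/author[@period="modern"]')
modern_names = [author.find("name").text for author in modern_authors]

# Approximation of authors/author/@period: ElementTree cannot select
# attribute nodes, so select the elements that have the attribute and
# read its value from each one.
period_values = [
    author.get("period")
    for author in document.findall("authors/author[@period]")
]

print(modern_names)
print(period_values)
```

An XPath engine that returns attribute nodes directly, as MSXML does, would not need the second workaround.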
The above XPath expressions are also presented in the XML Tutorial Application. To practice using XPath, build and run the interactive tutorial to see which XML nodes are selected by various XPath patterns and expressions.
For a complete list of XPath features supported by MSXML, see XPath Syntax.
Other Resources
XML Path Language (XPath) Version 1.0