Overview of XPath Axes and Node Relationships

MSXML 5.0 SDK

Microsoft XML Core Services (MSXML) 5.0 for Microsoft Office - XPath Developer's Guide

In addition to knowing the seven types of nodes, you also need to understand the possible kinds of relationships between and among nodes in a document. Each relationship corresponds to what XPath calls an axis. For more information, see Use XPath Axes to Navigate through XML Data.

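The terms used for the axes assume that the document tree represents a set of family relationships. The axes that can be expressed with XPath are:

  • Parent
  • Child
  • Ancestor
  • Descendant
  • Ancestor-or-self
  • Descendant-or-self
  • Preceding
  • Following
  • Preceding-sibling
  • Following-sibling
  • Self

Two additional relationships apply only to element nodes:

  • Attribute
  • Namespace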

Parent

The parent node of an element, attribute, processing instruction, comment, or text node is the element immediately above it in the tree, that is, the element which contains it. For the root element, and any comments or processing instructions which precede or follow it, the parent is the root node. The root node itself has no parent. The parent of a namespace node is the element to which that namespace node belongs. In the document diagrammed above, for instance:

  • The parent of the <books> element is the root node.
  • The parent of each of the catnum attributes is the corresponding <book> element.
  • The parent of the comment, "Are we sure this guy's name is spelled right???" is the first <book> element.
  • The parent of both the <?xml-stylesheet?> processing instruction and the comment at the end of the document is the root node.
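
The parent relationship can be checked with a short script. What follows is a minimal sketch in Python using the standard xml.dom.minidom module, not MSXML. The sample document is reconstructed from the descriptions in this topic; the catdate value, the second catnum value, the stylesheet href, and the closing comment text are placeholders, and the helper name xpath_parent is hypothetical. In an XPath expression the same relationship is written parent::node() or "..".

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def xpath_parent(node):
    """Return the node that XPath treats as the parent (hypothetical helper)."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        # The DOM leaves parentNode empty for attributes; XPath uses the owning element.
        return node.ownerElement
    return node.parentNode


books = doc.documentElement
first_book = books.firstChild
catnum = first_book.getAttributeNode("catnum")
comment = first_book.childNodes[1]

for node in (books, catnum, comment, doc.firstChild, doc.lastChild):
    print(node.nodeName, "-> parent:", xpath_parent(node).nodeName)
print("root node parent:", xpath_parent(doc))
```

The special case for attributes is needed because the DOM and the XPath data model disagree on this point: the DOM does not treat an element as the parent of its attributes, while XPath does.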

Child

A node's child is any node immediately below it in the hierarchy of a document's nodes, with some exceptions. The exceptions are that neither attribute nor namespace nodes are considered children of their respective elements. Therefore, if an XPath expression locates all children of a given element, it locates only elements, text nodes, processing instructions, and comments immediately subordinate to that element.

Processing instructions and comments in the document prolog or following the root element are considered children of the root node, as is the root element.

In the document diagrammed above, for instance:

  • The children of the root node are the <?xml-stylesheet?> processing instruction, the root <books> element, and the comment which follows the root element's end tag.
  • The children of the <books> element node are the two <book> elements. Note that the catdate attribute is not considered a child of the <books> element node, despite the fact that <books> is considered catdate's parent.
  • The children of the first <book> element are the first <title> element, the comment "Are we sure this guy's name is spelled right???" and the first <author> element.
  • The second <title> element has only one child, the text node "For Love of a Toothpick."
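
The child examples above can be reproduced with a small script. This is a sketch only, written in Python with the standard xml.dom.minidom module instead of MSXML. The sample document is reconstructed from this topic, with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment; xpath_children is a hypothetical helper name. The XPath form of the same query is child::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def xpath_children(node):
    """Return the children of a node as XPath sees them (hypothetical helper)."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        # minidom stores an attribute's value as a child text node; XPath does not.
        return []
    # childNodes never contains attributes, which matches the XPath rule.
    return list(node.childNodes)


books = doc.documentElement
first_book, second_book = xpath_children(books)
second_title = second_book.firstChild

print([n.nodeName for n in xpath_children(doc)])
print([n.nodeName for n in xpath_children(first_book)])
print([n.data for n in xpath_children(second_title)])
```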

Ancestor

An ancestor of any node in the tree is any node at a higher level in the tree than the node in question, including its parent. The root node has no ancestors; on the other hand, the root node is an ancestor of all other nodes in the tree.

For example, in the document diagrammed above:

  • The catdate attribute's ancestors are the <books> element and the root node.
  • The <?xml-stylesheet?> processing instruction, the root <books> element, and the comment following the <books> element each have a single ancestor, the root node.
  • Ancestors of the text node "Frey, Jörg" are the second <author> element, the second <book> element, the root <books> element, and the root node.
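
Collecting ancestors amounts to following parent links until the root node is reached. The sketch below illustrates this in Python with the standard xml.dom.minidom module (an assumption; it is not MSXML). The document is a reconstruction of the sample, with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment, and the helper name ancestors is hypothetical. In XPath the axis is written ancestor::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def ancestors(node):
    """Collect the ancestor axis of a node, nearest ancestor first."""
    found = []
    if node.nodeType == node.ATTRIBUTE_NODE:
        current = node.ownerElement  # XPath parent of an attribute
    else:
        current = node.parentNode
    while current is not None:
        found.append(current)
        current = current.parentNode
    return found


books = doc.documentElement
catdate = books.getAttributeNode("catdate")
second_book = books.lastChild
second_author = second_book.lastChild
name_text = second_author.firstChild

print([n.nodeName for n in ancestors(catdate)])
print([n.nodeName for n in ancestors(name_text)])
print(ancestors(doc))  # the root node has no ancestors
```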

Descendant

A descendant node is any node (including children) which is subordinate to a given node in the document tree. Everything in the document tree is a descendant of the root node, except the root node itself and any attribute or namespace nodes.

For example, in the document diagrammed earlier in this topic:

  • The catdate and catnum attributes have no descendants. (Attributes have no children.)
  • Descendants of the first <book> element are the first <title> element and its text node, "Jambing on the Trixles"; the comment, "Are we sure this guy's name is spelled right???"; and the first <author> element and its text node, "Randall, Tristan." Since an attribute is not considered a child of its defining element, the catnum="id2345" attribute is not a descendant of this <book> element (or indeed of anything else in the document).
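
A descendant search is a recursive walk over children. The following Python sketch uses the standard xml.dom.minidom module as a stand-in for MSXML. The document is reconstructed from this topic (catdate, the second catnum, the stylesheet href, and the closing comment are placeholder values), and descendants is a hypothetical helper name. The XPath equivalent is descendant::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def descendants(node):
    """Collect the descendant axis of a node in document order."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        # minidom keeps an attribute's value in a child text node; XPath
        # attributes have no children, so report nothing.
        return []
    found = []
    for child in node.childNodes:  # attributes are never in childNodes
        found.append(child)
        found.extend(descendants(child))
    return found


books = doc.documentElement
first_book = books.firstChild
catnum = first_book.getAttributeNode("catnum")

for node in descendants(first_book):
    print(node.nodeName, repr(node.nodeValue))
print("attribute descendants:", descendants(catnum))
print("catnum among root descendants:", catnum in descendants(doc))
```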

Ancestor-or-self

Ancestor-or-self nodes of a given node in the tree include all ancestors of that node and the node itself. Therefore, although the root node has no ancestors, an XPath expression which locates the ancestor-or-self nodes of the root node will locate the root node itself.

Some examples from the document diagrammed in this topic are:

  • The ancestor-or-self nodes of the concluding comment include the root node and the comment itself.
  • The catdate attribute's ancestor-or-self nodes are the <books> element which defines it, the root node, and the catdate attribute itself.

Descendant-or-self

If a given node in the tree has any descendants, an XPath expression locating its descendant-or-self nodes will locate all those descendants, and the node itself. If it has no descendants, such an expression will locate only the node itself. Thus, given the document diagrammed in this topic:

  • The second <title> element's descendant-or-self nodes are its text node, "For Love of a Toothpick," and that <title> element itself.
  • The document's concluding comment has no descendants; therefore it has only a single descendant-or-self node, which is the comment itself.
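
Both "-or-self" relationships simply add the starting node to the corresponding ancestor or descendant set. The Python sketch below shows this with the standard xml.dom.minidom module, used here as an assumption in place of MSXML. The document is a reconstruction with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment; all helper names are hypothetical. The XPath forms are ancestor-or-self::node() and descendant-or-self::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def ancestors(node):
    """Ancestor axis, nearest first; attributes start from their owning element."""
    is_attr = node.nodeType == node.ATTRIBUTE_NODE
    current = node.ownerElement if is_attr else node.parentNode
    found = []
    while current is not None:
        found.append(current)
        current = current.parentNode
    return found


def descendants(node):
    """Descendant axis in document order; attributes have none in XPath."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        return []
    found = []
    for child in node.childNodes:
        found.append(child)
        found.extend(descendants(child))
    return found


def ancestor_or_self(node):
    return [node] + ancestors(node)


def descendant_or_self(node):
    return [node] + descendants(node)


books = doc.documentElement
catdate = books.getAttributeNode("catdate")
closing = doc.lastChild
second_title = books.lastChild.firstChild

print([n.nodeName for n in ancestor_or_self(closing)])
print([n.nodeName for n in ancestor_or_self(catdate)])
print([n.nodeName for n in descendant_or_self(second_title)])
print([n.nodeName for n in descendant_or_self(closing)])
```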

Preceding

A preceding node, relative to a given node in the document tree, is any node which appears in the document before that node, except ancestors, attribute nodes, or namespace nodes. One way to think of this is that preceding nodes are those whose content occurs in its entirety before the start of the node in question. Since the root node contains all other nodes in the document, it will never be located among a given node's preceding nodes; likewise, the root node itself has no preceding nodes. Given the document diagrammed earlier in this topic, for example:

  • The root <books> element has only one preceding node, which is the <?xml-stylesheet?> processing instruction.
  • Everything in the document (except the attributes and the document root node) is a preceding node relative to the concluding comment.
  • The first <author> element's preceding nodes are the comment, "Are we sure this guy's name is spelled right???", the first <title> element and its text node ("Jambing on the Trixles"), and the <?xml-stylesheet?> processing instruction. The first <book> element, the root <books> element, and the root node are ancestors of the first <author> element: each of them ends only after the first <author> element closes, so none of them is among its preceding nodes.

Following

Following nodes are the reverse of preceding ones: They include any nodes (except descendant, attribute, and namespace nodes) which come after a given node. Since the root node contains all other nodes in the document, it is a following node of none of them, and itself has no following nodes.

Based on the document diagrammed in this topic, we can say:

  • The root <books> element has only one following node, which is the document's concluding comment.
  • The concluding comment has no following nodes.
  • The first <author> element's following nodes are the second <book> element (and all of the second <book> element's descendants) and the concluding comment. Because the text node "Randall, Tristan" is a descendant (specifically, a child) of the first <author> element, it is not considered a following node for that element.
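
The preceding and following relationships can both be derived from a single list of nodes in document order: preceding nodes come earlier in the list and are not ancestors, while following nodes come later and are not descendants. The Python sketch below shows one way to do this with the standard xml.dom.minidom module (an assumption; MSXML is not used). It handles only context nodes that are not attribute or namespace nodes. The document is reconstructed with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment, and all helper names are hypothetical. The XPath forms are preceding::node() and following::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def descendants(node):
    """All nodes below `node` in document order (attributes are never included)."""
    found = []
    for child in node.childNodes:
        found.append(child)
        found.extend(descendants(child))
    return found


def ancestors(node):
    """All nodes above `node`, nearest first."""
    found = []
    current = node.parentNode
    while current is not None:
        found.append(current)
        current = current.parentNode
    return found


def preceding(node):
    """Nodes before `node` in document order, minus its ancestors."""
    order = descendants(doc)  # every node except the root and the attributes
    if node not in order:     # the root node itself has no preceding nodes
        return []
    skip = ancestors(node)
    return [n for n in order[:order.index(node)] if n not in skip]


def following(node):
    """Nodes after `node` in document order, minus its descendants."""
    order = descendants(doc)
    if node not in order:     # the root node itself has no following nodes
        return []
    skip = descendants(node)
    return [n for n in order[order.index(node) + 1:] if n not in skip]


books = doc.documentElement
first_author = books.firstChild.lastChild
closing = doc.lastChild

print([n.nodeName for n in preceding(first_author)])
print([n.nodeName for n in following(first_author)])
print(len(preceding(closing)), len(following(closing)))
```

This sketch returns both lists in document order. XPath itself treats preceding as a reverse axis, which matters only when positions are counted inside a location step.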

Preceding-sibling

Sibling relationships in a family identify children of the same parent, relative to one another. Therefore preceding-sibling nodes are a subset of all siblings, including only those which appear in the document before the node in question. Since attribute and namespace nodes can never be child nodes, they can never be found among a given node's preceding-siblings.

Note   XPath does not define a simple sibling relationship, only preceding-sibling and following-sibling.

For instance, referring to the document described earlier in this topic:

  • Each of the <author> elements has its respective <title> element as a preceding-sibling. The first <author> element has a second preceding-sibling as well: the comment, "Are we sure this guy's name is spelled right???"
  • The root <books> element has one preceding-sibling, the <?xml-stylesheet?> processing instruction.
  • The concluding comment has two preceding-siblings: the root <books> element and the <?xml-stylesheet?> processing instruction.

Following-sibling

Among a given node's following-siblings, you will find all those nodes (except attribute and namespace nodes, which can never be child nodes) which share that node's parent and appear after the node in question in the document.

Note   XPath does not define a simple sibling relationship, only preceding-sibling and following-sibling.

Examples from the document diagrammed earlier in this topic include:

  • Neither of the <author> elements has any following-sibling nodes.
  • The root <books> element has one following-sibling, which is the concluding comment.
  • The concluding comment has no following-siblings.
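
The two sibling relationships can be sketched by stepping backward or forward along a node's sibling links. The Python example below uses the standard xml.dom.minidom module as an assumed stand-in for MSXML. The document is reconstructed from this topic, with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment; the helper names are hypothetical. In XPath the axes are written preceding-sibling::node() and following-sibling::node().

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def preceding_siblings(node):
    """Earlier children of the same parent, listed in document order."""
    found = []
    current = node.previousSibling
    while current is not None:
        found.append(current)
        current = current.previousSibling
    found.reverse()  # collected nearest-first; flip to document order
    return found


def following_siblings(node):
    """Later children of the same parent, listed in document order."""
    found = []
    current = node.nextSibling
    while current is not None:
        found.append(current)
        current = current.nextSibling
    return found


books = doc.documentElement
first_author = books.firstChild.lastChild
second_author = books.lastChild.lastChild
closing = doc.lastChild

for node in (first_author, second_author, books, closing):
    print(node.nodeName,
          [n.nodeName for n in preceding_siblings(node)],
          [n.nodeName for n in following_siblings(node)])
```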

Self

In some cases, you need an XPath expression which locates only the node you're already dealing with. For such cases, XPath provides the "self relationship." Every node in any given document has, of course, one and only one self node.

Attribute

Since attributes can never be found among the parent, child, ancestor, descendant, preceding, following, preceding-sibling, or following-sibling nodes of a given node, how do you locate them with XPath? The answer lies in the special attribute relationship. An element's attribute nodes are any attributes which it declares, and only elements can have attribute nodes.

Note   A processing instruction's pseudo-attributes cannot be located this way.

Using the sample document diagrammed earlier in this topic, for instance:

  • Each of the <book> elements has a single attribute node, the respective catnum attribute.
  • The root <books> element likewise has one attribute node, the catdate attribute.
  • None of the other elements in this sample document have any attribute nodes.
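
The attribute relationship can be illustrated as follows. This is a minimal sketch in Python using the standard xml.dom.minidom module, which is an assumption made for illustration and not MSXML. The document is a reconstruction with placeholder values for catdate, the second catnum, the stylesheet href, and the closing comment, and attribute_nodes is a hypothetical helper name. In XPath the axis is written attribute::* or, in abbreviated form, @*.

```python
from xml.dom import minidom

# Reconstructed sample document. Placeholder values (assumptions): catdate,
# the second catnum, the stylesheet href, and the closing comment text.
XML = (
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    '<books catdate="2000-12-31">'
    '<book catnum="id2345">'
    '<title>Jambing on the Trixles</title>'
    "<!--Are we sure this guy's name is spelled right???-->"
    '<author>Randall, Tristan</author>'
    '</book>'
    '<book catnum="id6789">'
    '<title>For Love of a Toothpick</title>'
    '<author>Frey, J\u00f6rg</author>'
    '</book>'
    '</books>'
    '<!--end of catalog-->'
)
doc = minidom.parseString(XML)  # the Document object stands in for the XPath root node


def attribute_nodes(node):
    """Return the attribute nodes of `node`; only elements can have any."""
    if node.nodeType != node.ELEMENT_NODE:
        return []
    return [
        attr for attr in node.attributes.values()
        # The DOM exposes namespace declarations as attributes; XPath does not.
        if attr.name != "xmlns" and not attr.name.startswith("xmlns:")
    ]


books = doc.documentElement
first_book = books.firstChild
pi = doc.firstChild

print([(a.name, a.value) for a in attribute_nodes(books)])
print([(a.name, a.value) for a in attribute_nodes(first_book)])
print(attribute_nodes(first_book.firstChild))  # <title> declares no attributes
print(attribute_nodes(pi), repr(pi.data))      # pseudo-attributes stay plain text
```

The last line shows why a processing instruction's pseudo-attributes cannot be located this way: they are part of the instruction's text content, not separate nodes.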

Namespace

An element has a namespace node for every namespace which is in scope for it. Because namespaces are implicitly inherited by all elements descended from an element which declares a namespace, the following hold true:

  • An element may have a namespace node even if the element does not itself declare a namespace.
  • An element may have multiple namespace nodes, one corresponding to each namespace that it declares and one corresponding to each namespace declared by an ancestor element.
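
The inheritance rule can be sketched by walking from an element up through its ancestors and collecting every namespace declaration found along the way. The Python example below uses the standard xml.dom.minidom module (an assumption, not MSXML). Because the sample document in this topic declares no namespaces, the document, prefixes, and URIs here are invented for illustration, and in_scope_namespaces is a hypothetical helper name. In XPath the axis is written namespace::*.

```python
from xml.dom import minidom

# Hypothetical document: the prefixes and URIs below are invented for
# illustration and do not come from the sample file.
XML = (
    '<books xmlns:cat="urn:example:catalog">'
    '<book xmlns:rev="urn:example:reviews"><cat:title>First</cat:title></book>'
    '<book><cat:title>Second</cat:title></book>'
    '</books>'
)
doc = minidom.parseString(XML)


def in_scope_namespaces(element):
    """Map prefix -> URI for each namespace in scope for `element`.

    The implicit "xml" prefix is left out for brevity.
    """
    chain = []
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        chain.append(node)
        node = node.parentNode
    scope = {}
    for el in reversed(chain):  # outermost first, so nearer declarations win
        for attr in el.attributes.values():
            if attr.name == "xmlns":
                if attr.value:
                    scope[""] = attr.value   # default namespace
                else:
                    scope.pop("", None)      # xmlns="" removes the default
            elif attr.name.startswith("xmlns:"):
                scope[attr.name[len("xmlns:"):]] = attr.value
    return scope


books = doc.documentElement
first_book, second_book = books.childNodes
first_title = first_book.firstChild

print(in_scope_namespaces(books))
print(in_scope_namespaces(first_title))  # declares nothing itself, inherits two
print(in_scope_namespaces(second_book))
```

Each entry in the returned mapping corresponds to one namespace node of the element in the XPath data model.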

See Also

Sample XML File for XPath Tree Model