Useful Boolean Predicates in XPath Expressions

Microsoft XML Core Services (MSXML) 5.0 for Microsoft Office - XPath Developer's Guide

A predicate (that is, an XPath expression within square brackets) applies a test to each node in the current node-set. A node that fails the test is removed from the node-set; a node that passes stays in. You can use this technique to perform very fine-grained analysis of the XML tree.

Predicates return Booleans

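For each node, a predicate evaluates to either true or false, indicating whether that node satisfies the predicate. A result that is not already a Boolean is converted first: a string is false only if it is empty, and a node-set is false only if it is empty. A number converted with boolean() is true unless it is 0 or NaN. A predicate whose result is itself a number is a special case: it is treated as a position test, so [2] means [position()=2] rather than "true".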
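The conversion rules can be checked with a short Java sketch that uses the standard javax.xml.xpath API rather than MSXML, on the assumption that both follow XPath 1.0 here. The class name and the sample XML are invented for illustration. Note that a bare number inside a predicate acts as a position test, while boolean() applies the "nonzero is true" rule.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample document is an assumption.
class PredicateTruthDemo {
    static final String XML =
        "<items><item>a</item><item>b</item><item>c</item></items>";

    // Evaluates an expression and converts the result to a Boolean.
    static boolean truth(String expr) {
        return (Boolean) eval(expr, XPathConstants.BOOLEAN);
    }

    // Counts the nodes selected by a location path.
    static int count(String path) {
        Double n = (Double) eval("count(" + path + ")", XPathConstants.NUMBER);
        return n.intValue();
    }

    private static Object eval(String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(truth("boolean(0)"));    // false: the number 0
        System.out.println(truth("boolean(7)"));    // true: a nonzero number
        System.out.println(truth("boolean('')"));   // false: an empty string
        System.out.println(truth("boolean('0')"));  // true: a non-empty string
        // A bare number is a position test: only the second item matches.
        System.out.println(count("/items/item[2]"));           // 1
        // Converted explicitly, the same number is simply true.
        System.out.println(count("/items/item[boolean(2)]"));  // 3
        // An empty node-set (no "missing" child) is false.
        System.out.println(count("/items/item[missing]"));     // 0
    }
}
```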

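Use OR for alternatives: [a or b]

will match a context node that has an a child, a b child, or both.

Use AND for filtering: [a and b]

will match only if the context node has both an a child and a b child. XPath is case-sensitive, so write both operators in lowercase, as shown in the expressions.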
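As a rough illustration of the two operators, the following Java sketch (standard javax.xml.xpath, not MSXML) runs both predicates against a small document. The class name, element names, and id attributes are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample document is an assumption.
class OrAndDemo {
    static final String XML =
        "<list>"
        + "<entry id='1'><a/></entry>"
        + "<entry id='2'><b/></entry>"
        + "<entry id='3'><a/><b/></entry>"
        + "<entry id='4'><c/></entry>"
        + "</list>";

    // Returns the id attribute of every element selected by the path.
    static List<String> ids(String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(path, new InputSource(new StringReader(XML)),
                          XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + path, e);
        }
    }

    public static void main(String[] args) {
        // Entries with an a child, a b child, or both.
        System.out.println(ids("/list/entry[a or b]"));   // [1, 2, 3]
        // Entries with both an a child and a b child.
        System.out.println(ids("/list/entry[a and b]"));  // [3]
    }
}
```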

Multiple predicates

You can chain predicates on one step, as in path[A][B], or apply them at different steps, as in path1[A]/path2[B]. XPath expressions are evaluated from left to right: in path1[A]/path2[B], path1[A] is determined first, and path2[B] is applied only to the elements that remain. Likewise, in path[A][B], B is tested only against the nodes that passed A.
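The left-to-right behavior can be sketched in Java with the standard javax.xml.xpath API, used here in place of MSXML. The store, shelf, and book elements and their attributes are invented for the example. Comparing the first two results shows that the shelf predicate removes a shelf before the book predicate is ever applied.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample document is an assumption.
class MultiplePredicatesDemo {
    static final String XML =
        "<store>"
        + "<shelf open='yes'><book price='5'/><book price='20'/></shelf>"
        + "<shelf open='no'><book price='30'/></shelf>"
        + "<shelf open='yes'><book price='40'/></shelf>"
        + "</store>";

    // Returns the price attribute of every book selected by the path.
    static List<String> prices(String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(path, new InputSource(new StringReader(XML)),
                          XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("price"));
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + path, e);
        }
    }

    public static void main(String[] args) {
        // No shelf predicate: books on every shelf are tested.
        System.out.println(prices("/store/shelf/book[@price > 10]"));
        // [20, 30, 40]

        // path1[A]/path2[B]: the closed shelf is dropped first.
        System.out.println(
            prices("/store/shelf[@open='yes']/book[@price > 10]"));
        // [20, 40]

        // path[A][B]: the second predicate sees only survivors of the first.
        System.out.println(
            prices("/store/shelf/book[@price > 10][@price < 35]"));
        // [20, 30]
    }
}
```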

AND is faster than sequential predicates

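The MSXML parser can evaluate Boolean expressions faster than it can create or reduce node-sets. As a consequence, while both path[A][B] and path[A and B] generate the same results, the 'and' form can improve the performance of your XPath query significantly. The two forms are interchangeable only when neither predicate depends on position (a numeric predicate, position(), or last()), because the second of two sequential predicates counts positions within the already filtered set.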
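The Java sketch below (standard javax.xml.xpath, not MSXML, so it says nothing about MSXML timings) only compares results. The class name and sample data are assumptions. It shows that the two forms agree for ordinary tests and that they can differ once a predicate depends on position.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample document is an assumption.
class AndVersusChainDemo {
    static final String XML =
        "<prices><p v='1'/><p v='12'/><p v='15'/><p v='50'/></prices>";

    // Returns the v attribute of every element selected by the path.
    static List<String> values(String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(path, new InputSource(new StringReader(XML)),
                          XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("v"));
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + path, e);
        }
    }

    public static void main(String[] args) {
        // Non-positional tests: both forms select the same nodes.
        System.out.println(values("/prices/p[@v > 10][@v < 40]"));     // [12, 15]
        System.out.println(values("/prices/p[@v > 10 and @v < 40]"));  // [12, 15]

        // Positional tests: the forms differ.
        // Chained: position() counts within the nodes that passed @v > 10.
        System.out.println(values("/prices/p[@v > 10][position() = 1]"));     // [12]
        // Combined: position() counts within all p children, and the
        // first p has v = 1, so nothing matches.
        System.out.println(values("/prices/p[@v > 10 and position() = 1]"));  // []
    }
}
```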

AND short-circuits expressions

One reason AND is efficient is that if the first part of an intersection (an AND clause) is false, the second part is not evaluated. As a consequence, when building AND XPath expressions, place the least expensive clause first to cut down on the number of searches, particularly if the other clause is complex. For example, the following XPath expression first checks that an element has the same name as the parameter $price.type, and only then applies the filter that checks whether the price lies between the low and high price:

//*[name(.)=$price.type and (number(.) > $price.low and number(.) < $price.high)]
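To try the expression outside MSXML, the following Java sketch binds the three parameters through the standard javax.xml.xpath variable resolver. The class name, the sample document, and the parameter values are assumptions made for the example. The sketch also assumes that the Java XPath implementation accepts variable names containing a period, which XML naming rules allow.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample document is an assumption.
class PriceFilterDemo {
    static final String XML =
        "<quotes>"
        + "<retail>19.99</retail>"
        + "<wholesale>12.50</wholesale>"
        + "<retail>45.00</retail>"
        + "<retail>8.00</retail>"
        + "</quotes>";

    // The expression from the text: the name test comes first, so the
    // numeric comparisons are skipped for elements with another name.
    static final String EXPR =
        "//*[name(.)=$price.type and "
        + "(number(.) > $price.low and number(.) < $price.high)]";

    // Returns the text of every element whose name equals type and
    // whose numeric value lies strictly between low and high.
    static List<String> select(String type, double low, double high) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("price.type", type);
        vars.put("price.low", low);
        vars.put("price.high", high);
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Look up each $variable by its local name.
            xpath.setXPathVariableResolver(
                name -> vars.get(name.getLocalPart()));
            NodeList nodes = (NodeList) xpath.evaluate(
                EXPR, new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + EXPR, e);
        }
    }

    public static void main(String[] args) {
        // Retail prices strictly between 10 and 40.
        System.out.println(select("retail", 10, 40));
        // Wholesale prices in the same range.
        System.out.println(select("wholesale", 10, 40));
    }
}
```

In an XSLT stylesheet the parameters would normally be declared with xsl:param, and the < character would have to be written as &lt; inside an attribute value.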