Controlling White Space Using the MSXML Processor

MSXML 5.0 SDK

Microsoft XML Core Services (MSXML) 5.0 for Microsoft Office - XSLT Developer's Guide

In most cases, the MSXML processor follows the white space handling rules defined by the XML and XSLT specifications.

The following are exceptions.

  • Adjacent text nodes are not treated as one node. Adjacent text nodes can appear if:
    • The user inserts them into the tree through the DOM API.
    • CDATA sections border #PCDATA nodes.
    • An entity reference containing text borders a #PCDATA or CDATA node. Character references, such as &#10; (to represent a newline), are automatically expanded by the parser and handled correctly.

    Few style sheets will encounter any of these conditions. Furthermore, even if adjacent text nodes exist, users seldom query for text nodes directly. In most cases, users obtain text nodes by referring to element and attribute nodes.

  • When the preserveWhiteSpace property is False, the DOM strips white space but keeps flags that mark where the white space existed. These flags are used to output extra white space when <xsl:copy-of> is used, or when the text value of a node is retrieved. The flags act as virtual nodes: they cannot be queried, but on output they behave like text nodes with a value of &#10; (newline). When preserveWhiteSpace is True, no flags are used, so this does not occur.
  • CDATA nodes containing only white space are never stripped.
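
The first exception is easier to picture with a concrete tree. The following is a minimal sketch of how adjacent text nodes can arise, written against Python's standard-library xml.dom.minidom rather than MSXML; this is an assumption made so the example is self-contained, and it does not reproduce MSXML-specific behavior such as the white space flags. The helper child_kinds is a hypothetical name introduced only for this illustration.

```python
from xml.dom import minidom, Node


def child_kinds(element):
    """Return a short label for each direct child node of an element."""
    names = {Node.TEXT_NODE: "text", Node.CDATA_SECTION_NODE: "cdata"}
    return [names.get(child.nodeType, "other") for child in element.childNodes]


# Case 1: a text node inserted through the DOM API lands next to the
# text node that the parser already created.
doc = minidom.parseString("<item>red</item>")
item = doc.documentElement
item.appendChild(doc.createTextNode(" apple"))
inserted_kinds = child_kinds(item)

# Case 2: a CDATA section bordering ordinary character data is kept as
# its own node between the two text nodes.
cdata_doc = minidom.parseString("<item>red<![CDATA[<b>]]>green</item>")
cdata_kinds = child_kinds(cdata_doc.documentElement)

# In this DOM implementation, normalize() merges adjacent text nodes.
item.normalize()
merged_kinds = child_kinds(item)
merged_text = item.firstChild.data

print(inserted_kinds, cdata_kinds, merged_kinds, merged_text)
```

In this stand-in DOM, a query for text nodes would see two separate nodes in the first case until normalize() is called, which is the kind of situation the first exception describes. As noted above, this rarely matters to style sheets that reach text through element and attribute nodes.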