Modeling Documents as Node Trees
The DOM provides an interface for loading, accessing, manipulating, and serializing XML documents. It represents a complete XML document in memory, giving random access to the contents of the entire document. Applications can therefore rely on the logic in the MSXML parser to handle XML-based information, rather than writing custom code to read and process XML.
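The four operations named above (load, access, manipulate, serialize) can be sketched in a few lines. This sketch uses Python's standard `xml.dom.minidom` module as a stand-in, not MSXML itself, so the method names and behavior are those of minidom. The `on_sale` attribute and the new price are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical one-book document, much smaller than a real catalog.
SOURCE = b'<catalog><book id="bk101"><price>44.95</price></book></catalog>'

# Load: parse the whole document into an in-memory node tree.
document = minidom.parseString(SOURCE)

# Access: reach any part of the tree without re-reading the source.
book = document.getElementsByTagName("book")[0]
book_id = book.getAttribute("id")
price = book.getElementsByTagName("price")[0]

# Manipulate: change a text node and add an attribute (illustrative values).
price.firstChild.data = "39.95"
book.setAttribute("on_sale", "true")

# Serialize: turn the modified tree back into XML text.
serialized = document.documentElement.toxml()
print(serialized)
```

The application never scans the XML text itself; every step goes through the node tree that the parser built.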
When the MSXML parser loads an XML document into a DOM, it reads it from start to finish and creates a logical model of nodes from the structures and content within the XML document. The document itself is considered a single node that contains all of the other nodes, including a node representing the root element, which, in turn, contains all of the element, attribute, and text nodes in the document.
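As a rough illustration of this containment model, the following sketch walks a parsed document and lists each node beneath its parent. It uses Python's `xml.dom.minidom` rather than MSXML, and the sample string and the `outline` helper are hypothetical names introduced only for this example.

```python
from xml.dom import minidom, Node

# Hypothetical sample, far smaller than the catalog used in this section.
SAMPLE = "<catalog><book id='bk101'><title>XML Developer's Guide</title></book></catalog>"

NODE_TYPE_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
}


def outline(node, depth=0):
    """Return one indented line per node, each parent before its children."""
    label = NODE_TYPE_NAMES.get(node.nodeType, "other")
    lines = ["  " * depth + f"{label}: {node.nodeName}"]
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes hang off their element; they are not child nodes.
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            lines.append("  " * (depth + 1) + f"attribute: {attr.name}={attr.value}")
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(minidom.parseString(SAMPLE))
print("\n".join(tree_lines))
```

The first line printed is the document node, and every other node appears indented beneath it, with the root element as its only child in this small sample.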
Example
The following XML document has a simple multi-tier structure.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="show_book.xsl"?>
<!DOCTYPE catalog [
  <!NOTATION XLS PUBLIC "http://www.microsoft.com/office/excel/">
  <!ELEMENT COLLECTION (DATE? , BOOK+) >
  <!ATTLIST COLLECTION xmlns:dt CDATA #FIXED "urn:schemas-microsoft-com:datatypes">
  <!ELEMENT BOOK (TITLE, AUTHOR, PUBLISHER) >
  <!ELEMENT DATE (#PCDATA) >
  <!ELEMENT TITLE (#PCDATA) >
  <!ELEMENT AUTHOR (#PCDATA) >
  <!ELEMENT PUBLISHER (#PCDATA) >
]>
<!--catalog last updated 2000-11-01-->
<catalog xmlns="http://www.example.com/catalog/">
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description><![CDATA[An in-depth look at creating applications with XML, using <, >,]]> and &amp;.</description>
  </book>
  <book id="bk109">
    <author>Kress, Peter</author>
    <title>Paradox Lost</title>
    <genre>Science Fiction</genre>
    <price>6.95</price>
    <publish_date>2000-11-02</publish_date>
    <description>After an inadvertent trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum.</description>
  </book>
</catalog>
After MSXML parses this document, the top two levels of the resulting node structure are as follows.
The topmost node is the document itself, which contains all of the other nodes. Immediately within the document are nodes representing the XML declaration, the style sheet processing instruction, the DOCTYPE declaration, the comment, and the root element for the document, in this case, catalog.
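The top level of such a tree can be listed programmatically. The sketch below again uses Python's `xml.dom.minidom` instead of MSXML, on a trimmed version of the sample with a shortened DTD and a single empty book element. The two parsers differ here: MSXML exposes the XML declaration as a node, as described above, whereas minidom does not, so the list contains four entries rather than five.

```python
from xml.dom import minidom, Node

# Trimmed version of the sample document (shortened DTD, one empty book).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="show_book.xsl"?>
<!DOCTYPE catalog [
  <!ELEMENT TITLE (#PCDATA) >
]>
<!--catalog last updated 2000-11-01-->
<catalog xmlns="http://www.example.com/catalog/">
  <book id="bk101"/>
</catalog>"""

NODE_TYPE_NAMES = {
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.DOCUMENT_TYPE_NODE: "document type",
    Node.COMMENT_NODE: "comment",
    Node.ELEMENT_NODE: "element",
}

document = minidom.parseString(SAMPLE)

# Each direct child of the document node, in document order.
top_level = [
    (NODE_TYPE_NAMES.get(node.nodeType, "other"), node.nodeName)
    for node in document.childNodes
]

for kind, name in top_level:
    print(f"{kind}: {name}")

# The root element is also reachable directly.
root_name = document.documentElement.tagName
```

Everything else in the document sits beneath the last of these children, the root element.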
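The catalog element contains the real content of the document: two book elements, each with an id attribute and child elements for the author, title, genre, price, publication date, and description.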
This part of the DOM contains element, attribute, text, and CDATA nodes. (The character references and built-in entities are converted to ordinary text by the parser, but the CDATA section has its own node.)
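The difference between the CDATA node and the ordinary text node can be seen in a short sketch. It uses Python's `xml.dom.minidom`, which happens to keep CDATA sections as separate nodes in a similar way, but it is not MSXML, and the shortened description element is an assumption made for brevity.

```python
from xml.dom import minidom, Node

# Shortened form of the first book's description element.
SAMPLE = b"<description><![CDATA[using <, >,]]> and &amp;.</description>"

document = minidom.parseString(SAMPLE)

# Classify each child of the element and record the text it carries.
parts = []
for node in document.documentElement.childNodes:
    kind = "cdata" if node.nodeType == Node.CDATA_SECTION_NODE else "text"
    parts.append((kind, node.data))

for kind, data in parts:
    print(f"{kind}: {data!r}")
```

The CDATA section keeps its literal `<` and `>` characters in a node of its own, while the `&amp;` entity that follows it arrives in the text node as a plain ampersand.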