Working with the DOM
The DOM allows applications to work with XML document structures and information as program structures rather than text streams. Applications and scripts can read and manipulate these structures without knowing the details of XML syntax, taking advantage of the facilities built into the DOM API of MSXML.
The DOM uses two key abstractions: a tree-like hierarchy and nodes that represent document content and structures. The hierarchy is composed of these nodes, which may contain or be contained by other nodes. For developers, this means that much of the work of XML processing requires navigating this tree structure to find or modify the information it contains. Working with XML requires thinking of information in terms of nested containers, and making sure that information is put into or retrieved from the right container.
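To make the nested-container idea concrete, the following sketch models a small document as plain JavaScript objects rather than MSXML nodes. The `makeElement`, `makeText`, and `findByPath` helpers and the sample catalog data are hypothetical and exist only for this illustration. Only the property names `nodeType`, `nodeName`, `nodeValue`, and `childNodes` mirror the DOM.

```javascript
// Hypothetical stand-ins for DOM nodes: plain objects, not MSXML.
function makeElement(name, children) {
  return { nodeType: 1, nodeName: name, nodeValue: null, childNodes: children || [] };
}
function makeText(value) {
  return { nodeType: 3, nodeName: "#text", nodeValue: value, childNodes: [] };
}

// Models <catalog><book><title>XML Basics</title><price>29.95</price></book></catalog>
var catalog = makeElement("catalog", [
  makeElement("book", [
    makeElement("title", [makeText("XML Basics")]),
    makeElement("price", [makeText("29.95")])
  ])
]);

// Descend one container per path step; return null if a step is missing.
function findByPath(node, path) {
  for (var i = 0; i < path.length; i++) {
    var next = null;
    for (var j = 0; j < node.childNodes.length; j++) {
      if (node.childNodes[j].nodeName === path[i]) {
        next = node.childNodes[j];
        break;
      }
    }
    if (next === null) { return null; }
    node = next;
  }
  return node;
}

var price = findByPath(catalog, ["book", "price"]);
console.log(price.childNodes[0].nodeValue); // prints 29.95
```

The value lives in a text node inside the price element, so reading it means reaching the right container first and then looking inside it.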
The DOM treats nodes as generic objects, making it possible to create a script that loads a document and then traverses all of the nodes, reporting what it finds in the tree.
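As a rough illustration of that generic treatment, the sketch below walks a miniature tree built from plain JavaScript objects and reports every node in the same way, whatever its type. The sample tree and the `walk` function are hypothetical. A real script would obtain its nodes from MSXML instead.

```javascript
// Hypothetical miniature tree built from plain objects (not MSXML nodes).
var doc = {
  nodeType: 9, nodeName: "#document", nodeValue: null, childNodes: [
    { nodeType: 1, nodeName: "note", nodeValue: null, childNodes: [
      { nodeType: 3, nodeName: "#text", nodeValue: "hello", childNodes: [] },
      { nodeType: 8, nodeName: "#comment", nodeValue: "draft", childNodes: [] }
    ] }
  ]
};

// Visit every node the same way, whatever its type, and collect a report.
function walk(node, depth, report) {
  var pad = new Array(depth + 1).join("  "); // two spaces per level
  var line = pad + node.nodeName;
  if (node.nodeValue !== null) { line += " = " + node.nodeValue; }
  report.push(line);
  for (var i = 0; i < node.childNodes.length; i++) {
    walk(node.childNodes[i], depth + 1, report);
  }
  return report;
}

console.log(walk(doc, 0, []).join("\n"));
```

Because `walk` reads only properties that every node carries, it needs no special case for elements, text, or comments.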
The DOM programming interfaces enable applications to traverse the tree and manipulate its nodes. Each node is defined as a specific node type, according to the XML DOM enumerated constants, which also define valid parent and child nodes for each node type. For most XML documents, the most common node types are element, attribute, and text. Attributes occupy a special place in the model because they are not considered child nodes of a parent, and are treated more like properties of elements. An additional programming interface, IXMLDOMNamedNodeMap, is provided for attributes.
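The separation between attributes and child nodes can be sketched with plain JavaScript objects. The numeric values 1, 2, and 3 are the node-type values the DOM assigns to element, attribute, and text nodes. Everything else here is an assumption made for illustration: `makeNamedNodeMap` is a hypothetical stand-in that imitates only the `length`, `item`, and `getNamedItem` members of a named node map, and the book element is invented sample data.

```javascript
// Node-type values the DOM uses for the three most common node types.
var NODE_ELEMENT = 1, NODE_ATTRIBUTE = 2, NODE_TEXT = 3;

// Hypothetical stand-in for a named node map: lookup by index or by name.
function makeNamedNodeMap(attrs) {
  return {
    length: attrs.length,
    item: function (i) { return attrs[i] || null; },
    getNamedItem: function (name) {
      for (var i = 0; i < attrs.length; i++) {
        if (attrs[i].nodeName === name) { return attrs[i]; }
      }
      return null;
    }
  };
}

// Models <book id="b1" lang="en">XML Basics</book>
var book = {
  nodeType: NODE_ELEMENT,
  nodeName: "book",
  attributes: makeNamedNodeMap([
    { nodeType: NODE_ATTRIBUTE, nodeName: "id", nodeValue: "b1" },
    { nodeType: NODE_ATTRIBUTE, nodeName: "lang", nodeValue: "en" }
  ]),
  childNodes: [
    { nodeType: NODE_TEXT, nodeName: "#text", nodeValue: "XML Basics" }
  ]
};

// The attributes are reachable only through the map, not through childNodes.
console.log(book.childNodes.length);                         // prints 1
console.log(book.attributes.length);                         // prints 2
console.log(book.attributes.getNamedItem("lang").nodeValue); // prints en
```

A walker that follows only `childNodes` would never see `id` or `lang`, which is why a tree walk needs a separate pass over each element's attributes.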
Examples
This sample Active Server Pages (ASP) script uses the MSXML parser to parse a document into a DOM tree, then move down the tree from the root node and report the kinds of nodes it encounters and their content.
VBScript
The first version uses Microsoft Visual Basic® Scripting Edition (VBScript) to load the document and walk the tree. If no form input is provided, it presents the user with a form to gather the URL of an XML document. The user then submits that back to the script, which parses the document and presents a tree.
    <%@LANGUAGE=VBScript%>
    <html>
    <head>
    <title>Tree walk test - VBScript</title>
    </head><body>
    <%
    ' Writes one line per attribute of an element node.
    ' The indentation is written for every attribute so that all of them line up.
    function attribute_walk(node)
      For Each attrib In node.attributes
        For i=1 to indent
          Response.Write(" ")
        Next
        Response.Write("|--")
        Response.Write(attrib.nodeTypeString)
        Response.Write(":")
        Response.Write(attrib.name)
        Response.Write("--")
        Response.Write(attrib.nodeValue)
        Response.Write("<br />")
      Next
    end function

    ' Recursively reports the type, name, and content of each child node.
    function tree_walk(node)
      dim nodeName
      indent=indent+2
      For Each child In node.childNodes
        For i=1 to indent
          Response.Write(" ")
        Next
        Response.Write("|--")
        Response.Write(child.nodeTypeString)
        Response.Write("--")
        ' Node types 1 (element) and 2 (attribute) have meaningful names.
        If child.nodeType<3 Then
          Response.Write(child.nodeName)
          Response.Write("<br />")
        End If
        ' Attributes are not child nodes, so elements need a separate pass.
        If (child.nodeType=1) Then
          If (child.attributes.length>0) Then
            indent=indent+2
            attribute_walk(child)
            indent=indent-2
          End If
        End If
        If (child.hasChildNodes) Then
          tree_walk(child)
        Else
          Response.Write child.text
          Response.Write("<br />")
        End If
      Next
      indent=indent-2
    end function

    ' Main routine: load the requested document, or present the form.
    xmlFile=Request.Form("fileURI")
    Dim root
    Dim xmlDoc
    Dim child
    Dim indent
    indent=0
    Set xmlDoc = CreateObject("Msxml2.DOMDocument.5.0")
    xmlDoc.async = False
    xmlDoc.validateOnParse=False
    xmlDoc.load(xmlFile)
    If xmlDoc.parseError.errorCode = 0 Then
      'Walk from the root to each of its child nodes:
      Response.Write("<pre>")
      tree_walk(xmlDoc)
      Response.Write("</pre>")
    Else
    %>
    <h1>XML Parsing - DOM Tree Walk Demo</h1>
    <form id="location" method="post" action="">
    <input type="text" name="fileURI" maxlength="255" size="20" id="XMLurl" />
    <br />
    <input type="submit" name="submit" value="submit" />
    </form>
    <%
    End If
    %>
    </body></html>
At the bottom, the script contains a main routine that either loads a document and passes it to the tree walker or presents a form asking which document to load. This script relies on the tree_walk function, a recursive function that moves from node to node in the tree and presents a suitably formatted version of the contents. That function in turn relies on an attribute_walk function to present attribute content, because attribute nodes are not considered children of element nodes within the DOM.
JScript
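The Microsoft JScript® version is similar to the VBScript version. However, its loop counter is a global variable that every recursive call to tree_walk resets, so the script needs extra code to save the counter in the depthList array before each recursive call and to restore it afterward.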
    <%@LANGUAGE=JScript%>
    <html>
    <head>
    <title>Tree walk test - JScript</title>
    </head><body>
    <%
    // Writes one line per attribute of an element node.
    // The indentation is written for every attribute so that all of them line up.
    function attribute_walk(node) {
      for (m=0; m<node.attributes.length; m++){
        for (k=1; k<indent; k++) {
          Response.Write(" ");
        }
        attrib = node.attributes.item(m);
        Response.Write("|--");
        Response.Write(attrib.nodeTypeString);
        Response.Write(":");
        Response.Write(attrib.name);
        Response.Write("--");
        Response.Write(attrib.nodeValue);
        Response.Write("<br />");
      }
    } //end attribute_walk

    // Recursively reports the type, name, and content of each child node.
    function tree_walk(node) {
      indent=indent+2;
      for (current=0; current<node.childNodes.length; current++) {
        child=node.childNodes.item(current);
        for (j=1; j<indent; j++){
          Response.Write(" ");
        }
        Response.Write("|--");
        Response.Write(child.nodeTypeString);
        Response.Write("--");
        // Node types 1 (element) and 2 (attribute) have meaningful names.
        if (child.nodeType<3) {
          Response.Write(child.nodeName);
          Response.Write("<br />");
        }
        // Attributes are not child nodes, so elements need a separate pass.
        if (child.nodeType==1) {
          if (child.attributes.length>0) {
            indent=indent+2;
            attribute_walk(child);
            indent=indent-2;
          }
        }
        // hasChildNodes is a method, so it must be called with parentheses.
        if (child.hasChildNodes()) {
          //store information so recursion is possible
          depthList[depth]=current;
          depth=depth+1;
          tree_walk(child);
          //return from recursion
          depth=depth-1;
          current=depthList[depth];
        }else{
          Response.Write (child.text);
          Response.Write("<br />");
        }
      }
      indent=indent-2;
    }

    //recursion-tracking variables
    depth=0;
    depthList=new Array();
    indent=0;
    xmlFile=new String();
    xmlFile=Request.Form("fileURI");
    xmlFile=""+xmlFile; //makes string clean for passing to MSXML
    xmlPresented=false;
    var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.5.0");
    xmlDoc.async = false;
    xmlDoc.validateOnParse=false;
    //xmlFile="http://127.0.0.1/ms/local.xml"
    if ((xmlFile)) {
      xmlDoc.load(xmlFile);
      // An error code of 0 means the document parsed successfully.
      if (xmlDoc.parseError.errorCode == 0) {
        Response.Write("<pre>");
        tree_walk(xmlDoc);
        Response.Write("</pre>");
        xmlPresented=true; // assignment, so the form is skipped after a successful walk
      }
    }
    if (xmlPresented==false){
    %>
    <h1>XML Parsing - DOM Tree Walk Demo</h1>
    <form id="location" method="post" action="">
    <input type="text" name="fileURI" maxlength="255" size="20" id="fileURI" />
    <br />
    <input type="submit" name="submit" value="submit" />
    </form>
    <%
    }
    %>
    </body></html>
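The bookkeeping with depth and depthList in the JScript version is one way to keep a shared loop counter intact across recursive calls. A common alternative, which the script above does not use, is to declare the counter with `var` inside the function so that each call gets its own copy. The sketch below compares the two approaches on a small tree of plain objects. The `state` object, the `walkShared` and `walkLocal` functions, and the sample tree are all hypothetical and stand in for the script's global variables and MSXML nodes.

```javascript
// Hypothetical sample tree: a has children b and c; b has children d and e.
var tree = { name: "a", kids: [
  { name: "b", kids: [ { name: "d", kids: [] }, { name: "e", kids: [] } ] },
  { name: "c", kids: [] }
] };

// Approach 1: one shared counter, saved before and restored after recursion.
var state = { current: 0, depth: 0, depthList: [] };
function walkShared(node, out) {
  for (state.current = 0; state.current < node.kids.length; state.current++) {
    var kid = node.kids[state.current];
    out.push(kid.name);
    state.depthList[state.depth] = state.current; // save this level's position
    state.depth++;
    walkShared(kid, out);                         // overwrites state.current
    state.depth--;
    state.current = state.depthList[state.depth]; // restore this level's position
  }
  return out;
}

// Approach 2: a counter declared with var is private to each call.
function walkLocal(node, out) {
  for (var i = 0; i < node.kids.length; i++) {
    out.push(node.kids[i].name);
    walkLocal(node.kids[i], out);
  }
  return out;
}

console.log(walkShared(tree, []).join(",")); // prints b,d,e,c
console.log(walkLocal(tree, []).join(","));  // prints b,d,e,c
```

Both functions visit the nodes in the same order. The second simply lets the language keep one counter per call instead of tracking the positions by hand.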