Working with the DOM

MSXML 5.0 SDK

Microsoft XML Core Services (MSXML) 5.0 for Microsoft Office - DOM Developer's Guide

The DOM allows applications to work with XML document structures and information as program structures rather than text streams. Applications and scripts can read and manipulate these structures without knowing the details of XML syntax, taking advantage of the facilities built into the DOM API of MSXML.

The DOM uses two key abstractions: a tree-like hierarchy and nodes that represent document content and structures. The hierarchy is composed of these nodes, which may contain or be contained by other nodes. For developers, this means that much of the work of XML processing requires navigating this tree structure to find or modify the information it contains. Working with XML requires thinking of information in terms of nested containers, and making sure that information is put into or retrieved from the right container.

The DOM treats nodes as generic objects, making it possible to create a script that loads a document and then traverses all of the nodes, reporting what it finds in the tree.
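As a rough illustration of treating nodes generically, the following JScript-compatible sketch walks a hand-built tree of plain objects rather than a real MSXML document. The `makeNode` and `describeTree` helpers and the `kind`, `name`, and `children` fields are hypothetical stand-ins chosen for this example; they are not part of the MSXML API.

```javascript
// Hypothetical stand-in for a DOM node: a kind, a name, and child nodes.
function makeNode(kind, name, children) {
  return { kind: kind, name: name, children: children || [] };
}

// Visit every node the same way, whatever its kind, and collect
// one indented line per node.
function describeTree(node, depth, lines) {
  var pad = "";
  for (var i = 0; i < depth; i++) {
    pad += "  ";
  }
  lines.push(pad + node.kind + ": " + node.name);
  for (var c = 0; c < node.children.length; c++) {
    describeTree(node.children[c], depth + 1, lines);
  }
  return lines;
}

// A small tree modeled on <book><title>some text</title></book>.
var sampleTree = makeNode("document", "#document", [
  makeNode("element", "book", [
    makeNode("element", "title", [
      makeNode("text", "#text")
    ])
  ])
]);

var report = describeTree(sampleTree, 0, []);
```

Because the walker never branches on the kind of node, the same function can report on any tree built from these containers.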

The following are exposed by the XML DOM.

The DOM programming interfaces enable applications to traverse the tree and manipulate its nodes. Each node is defined as a specific node type, according to the XML DOM enumerated constants, which also define valid parent and child nodes for each node type. For most XML documents, the most common node types are element, attribute, and text. Attributes occupy a special place in the model because they are not considered child nodes of a parent, and are treated more like properties of elements. An additional programming interface, the IXMLDOMNamedNodeMap, is provided for attributes.
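The separation between child nodes and attributes can be sketched with plain objects. This is a simplified model, not MSXML itself: the `book` object and the `childNames` and `attributeNames` helpers are hypothetical, and a plain array stands in for the IXMLDOMNamedNodeMap that MSXML returns. The numeric values are those of the element, attribute, and text node type constants.

```javascript
// Numeric values of three node type constants (element, attribute, text).
var NODE_ELEMENT = 1;
var NODE_ATTRIBUTE = 2;
var NODE_TEXT = 3;

// Hypothetical plain-object model of <book id="b1">Dune</book>.
// The attribute lives in its own collection, not in childNodes.
var book = {
  nodeType: NODE_ELEMENT,
  nodeName: "book",
  attributes: [
    { nodeType: NODE_ATTRIBUTE, nodeName: "id", nodeValue: "b1" }
  ],
  childNodes: [
    { nodeType: NODE_TEXT, nodeName: "#text", nodeValue: "Dune" }
  ]
};

// Walking childNodes alone never reaches the attribute.
function childNames(node) {
  var names = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    names.push(node.childNodes[i].nodeName);
  }
  return names;
}

// Attributes need a separate pass over the attribute collection.
function attributeNames(node) {
  var names = [];
  for (var i = 0; i < node.attributes.length; i++) {
    names.push(node.attributes[i].nodeName);
  }
  return names;
}

var children = childNames(book);
var attrs = attributeNames(book);
```

A tree walker that only follows child nodes therefore has to visit attributes in a separate step if they are to appear in its report.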

Examples

This sample Active Server Pages (ASP) script uses the MSXML parser to parse a document into a DOM tree, then move down the tree from the root node and report the kinds of nodes it encounters and their content.

VBScript

The first version uses Microsoft Visual Basic® Scripting Edition (VBScript) to load the document and walk the tree. If no form input is provided, it presents a form that asks for the URL of an XML document. When the user submits the form, the script parses the document and presents its tree.

<%@LANGUAGE=VBScript%>
<html>
<head>
<title>Tree walk test - VBScript</title>
</head><body>
<%

' Writes one indented line for each attribute of an element node.
Function attribute_walk(node)
  Dim attrib, i

  For Each attrib In node.attributes
    ' Indent every attribute line, not just the first one.
    For i = 1 To indent
      Response.Write("&nbsp;")
    Next
    Response.Write("|--")
    Response.Write(attrib.nodeTypeString)
    Response.Write(":")
    Response.Write(attrib.name)
    Response.Write("--")
    Response.Write(attrib.nodeValue)
    Response.Write("<br />")
  Next
End Function


' Recursively writes one line for each node beneath the given node.
Function tree_walk(node)
  ' Local declarations keep each recursive call's loop state separate.
  Dim child, i

  indent = indent + 2

  For Each child In node.childNodes
    For i = 1 To indent
      Response.Write("&nbsp;")
    Next
    Response.Write("|--")
    Response.Write(child.nodeTypeString)
    Response.Write("--")
    ' Node types below 3 are element (1) and attribute (2).
    If child.nodeType < 3 Then
      Response.Write(child.nodeName)
      Response.Write("<br />")
    End If
    ' Attributes are not child nodes, so report them separately.
    If child.nodeType = 1 Then
      If child.attributes.length > 0 Then
        indent = indent + 2
        attribute_walk(child)
        indent = indent - 2
      End If
    End If
    If child.hasChildNodes Then
      tree_walk(child)
    Else
      Response.Write(child.text)
      Response.Write("<br />")
    End If
  Next

  indent = indent - 2
End Function

xmlFile=Request.Form("fileURI")

Dim root
Dim xmlDoc
Dim child
Dim indent

indent=0

Set xmlDoc = CreateObject("Msxml2.DOMDocument.5.0")
xmlDoc.async = False
xmlDoc.validateOnParse=False
xmlDoc.load(xmlFile)
If xmlDoc.parseError.errorcode = 0 Then 
'Walk from the root to each of its child nodes:
  Response.Write("<pre>")
  tree_walk(xmlDoc)
  Response.Write("</pre>")
Else
%>
<h1>XML Parsing - DOM Tree Walk Demo</h1>
<form id="location" method="post" action="">
<input type="text" name="fileURI" maxlength="255" size="20" id="XMLurl" /> 
<br />
<input type="submit" name="submit" value="submit" />
</form>
<% End If%>
</body></html>

At the bottom, the script contains a main routine that either loads a document and passes it to the tree walker or presents a form asking which document to load. This script relies on the tree_walk function, a recursive function that moves from node to node in the tree and presents a suitably formatted version of the contents. That function in turn relies on an attribute_walk function to present attribute content, because attribute nodes are not considered children of element nodes within the DOM.

JScript

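The Microsoft JScript® version is similar to the VBScript version. However, its variables are assigned without var and are therefore global, so it needs extra lines of code to save and restore the loop counter around each recursive call. Without them, the recursive tree walk would overwrite the counter.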

<%@LANGUAGE=JScript%>
<html>
<head>
<title>Tree walk test - JScript</title>
</head><body>
<%

function attribute_walk(node) {
  // var keeps these variables local to this call
  var k, m, attrib;

  for (m=0; m<node.attributes.length; m++) {
    // indent every attribute line, not just the first one
    for (k=1; k<indent; k++) {
      Response.Write("&nbsp;");
    }
    attrib = node.attributes.item(m);
    Response.Write("|--");
    Response.Write(attrib.nodeTypeString);
    Response.Write(":");
    Response.Write(attrib.name);
    Response.Write("--");
    Response.Write(attrib.nodeValue);
    Response.Write("<br />");
  }
} //end attribute_walk

function tree_walk(node) {

indent=indent+2;
for (current=0; current<node.childNodes.length; current++) {
  child=node.childNodes.item(current);
  for (j=1; j<indent; j++){
    Response.Write("&nbsp;");
  }
  Response.Write("|--");
  Response.Write(child.nodeTypeString);
  Response.Write("--");
  if (child.nodeType<3) {
    Response.Write(child.nodeName);
    Response.Write("<br />");
  }
  if (child.nodeType==1) { 
    if (child.attributes.length>0) {
      indent=indent+2;
      attribute_walk(child);
      indent=indent-2;
    }
  }
  if (child.hasChildNodes()) { // hasChildNodes is a method, so call it
//store information so recursion is possible
    depthList[depth]=current;
    depth=depth+1;
    tree_walk(child);
//return from recursion
    depth=depth-1;
    current=depthList[depth];

  }else{
    Response.Write (child.text);
    Response.Write("<br />");
  }
}

  indent=indent-2;

}

//recursion-tracking variables
depth=0;
depthList=new Array();

indent=0;
xmlFile=new String();

xmlFile=Request.Form("fileURI");
xmlFile=""+xmlFile; //makes string clean for passing to MSXML

xmlPresented=false;

var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.5.0");
xmlDoc.async = false;
xmlDoc.validateOnParse=false;
//xmlFile="http://127.0.0.1/ms/local.xml"
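if ((xmlFile)) {
 xmlDoc.load(xmlFile);
 // errorCode is 0 when the document was parsed without errors
 if (xmlDoc.parseError.errorCode == 0) {
  Response.Write("<pre>");
  tree_walk(xmlDoc);
  Response.Write("</pre>");
  xmlPresented=true; // assign (not compare) so the form is skipped
 }
}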
if (xmlPresented==false){
%>
<h1>XML Parsing - DOM Tree Walk Demo</h1>
<form id="location" method="post" action="">
<input type="text" name="fileURI" maxlength="255" size="20" id="fileURI" /> 
<br />
<input type="submit" name="submit" value="submit" />
</form>
<% } %>
</body></html>
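The recursion bookkeeping in the JScript listing above (depthList and depth) is needed because the loop counter is shared by every call. The sketch below illustrates that effect on a plain-object tree, not on MSXML nodes. The `countShared` and `countLocal` functions and the `tree` object are hypothetical names introduced for this example, and the shared counter is declared explicitly at the top level to stand in for a variable assigned without `var`.

```javascript
// One loop counter shared by every call, like a variable assigned
// without `var` inside a function.
var sharedIndex = 0;

function countShared(node) {
  var total = 1;
  for (sharedIndex = 0; sharedIndex < node.children.length; sharedIndex++) {
    // The nested call resets and advances sharedIndex, so this loop
    // resumes from wherever the nested call left the counter.
    total += countShared(node.children[sharedIndex]);
  }
  return total;
}

// Declaring the counter with `var` gives each call its own copy.
function countLocal(node) {
  var total = 1;
  for (var i = 0; i < node.children.length; i++) {
    total += countLocal(node.children[i]);
  }
  return total;
}

// Hypothetical tree: root -> [x -> [x1], b], four nodes in all.
var tree = {
  children: [
    { children: [ { children: [] } ] },
    { children: [] }
  ]
};

var sharedCount = countShared(tree); // 3: node b is skipped
var localCount = countLocal(tree);   // 4
```

With this tree the shared counter skips a node. With other tree shapes it can revisit nodes or fail to terminate, which is why the listing saves the counter before each recursive call and restores it afterward. Declaring the counter with `var` inside the function is an alternative that avoids the bookkeeping.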