Enforcing Character Encoding with DOM

MSXML 5.0 SDK

Microsoft XML Core Services (MSXML) 5.0 for Microsoft Office - DOM Developer's Guide

In some cases, an XML document is passed to and processed by an application (for example, an ASP page) that cannot properly decode rare or new characters. When this happens, you might be able to work around the problem by letting the DOM handle the character encoding, which bypasses the incapable application.

For example, the following XML document contains the numeric character reference ("&#8364;") that corresponds to the Euro currency symbol (€). The ASP page, incapable.asp, cannot process currency.xml.

XML Data (currency.xml)

<?xml version="1.0" encoding="utf-8"?>
<currency>
   <name>Euro</name>
   <symbol>&#8364;</symbol>
   <exchange>
      <base>US$</base>
      <rate>1.106</rate>
   </exchange>
</currency>

ASP Page (incapable.asp)

<%@language = "javascript"%>
<%
   var doc = new ActiveXObject("Msxml2.DOMDocument");
   doc.async = false;
   if (doc.load(Server.MapPath("currency.xml"))==true) {
      Response.ContentType = "text/xml";
      Response.Write(doc.xml);
   }
%>

When incapable.asp attempts to process currency.xml, an error occurs at the Response.Write(doc.xml) instruction. However, you can replace this instruction with the following line:

doc.save(Response); 

With this line, the error does not occur. The ASP code produces the correct output in a Web browser, as follows:

  <?xml version="1.0" encoding="utf-8" ?> 
  <currency>
    <name>Euro</name> 
    <symbol>€</symbol> 
    <exchange>
      <base>US$</base> 
      <rate>1.106</rate> 
    </exchange>
  </currency>
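
To put the replacement line in context, the following is a minimal sketch of the same request handling written as a helper function. The function name sendXmlFile and the error branch are assumptions added for illustration and are not part of the original sample. The doc argument is expected to be an MSXML DOMDocument, and response is expected to be the ASP Response object.

```javascript
// Hypothetical helper (not part of the original sample): loads an XML file
// and lets the DOM document write itself to the response.
function sendXmlFile(doc, path, response) {
   doc.async = false;                    // load synchronously, as in the sample
   if (!doc.load(path)) {
      // Loading failed: report the parser's reason instead of sending XML.
      response.ContentType = "text/plain";
      response.Write("Could not load XML: " + doc.parseError.reason);
      return false;
   }
   response.ContentType = "text/xml";
   doc.save(response);                   // the DOM, not Response.Write, encodes the output
   return true;
}
```

On an ASP page, this helper might be called as sendXmlFile(new ActiveXObject("Msxml2.DOMDocument"), Server.MapPath("currency.xml"), Response).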

The effect of the change in the ASP page is to let the DOM object (doc)—instead of the Response object on the ASP page—handle the character encoding.
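
Why it matters which component performs the encoding can be pictured with a small sketch that runs outside ASP (for example, in Node.js, where TextEncoder is available). It assumes, purely for illustration, a writer limited to the single-byte Latin-1 code page. The helper names are hypothetical, and the sketch does not describe what ASP or MSXML do internally. It only shows that the same character survives or is lost depending on the encoder.

```javascript
// Illustration only: two ways of turning the Euro sign into bytes.
var euro = "\u20AC"; // the character written as &#8364; in currency.xml

// Hypothetical single-byte writer: Latin-1 has no Euro sign, so any code
// point above 0xFF is replaced with "?" (0x3F) and the character is lost.
function toLatin1Bytes(text) {
   var bytes = [];
   for (var i = 0; i < text.length; i++) {
      var code = text.charCodeAt(i);
      bytes.push(code <= 0xFF ? code : 0x3F);
   }
   return bytes;
}

// UTF-8 writer: matches the encoding declared in currency.xml, so the
// Euro sign is preserved as a three-byte sequence.
function toUtf8Bytes(text) {
   return Array.from(new TextEncoder().encode(text));
}

var lossy = toLatin1Bytes(euro); // [63]            -> "?"
var exact = toUtf8Bytes(euro);   // [226, 130, 172] -> 0xE2 0x82 0xAC
```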