About XML documents in Word

Microsoft Office Word 2003

Note  XML features, except for saving documents as XML with the Word XML schema, are available only in Microsoft Office Professional Edition 2003 and stand-alone Microsoft Office Word 2003.

Why XML?

Extensible Markup Language (XML) enables you to organize and work with documents and data in ways that were previously impossible or very difficult. By using custom XML schemas, you can now identify and extract specific pieces of business data from ordinary business documents.

For example, an invoice that contains the name and address of a customer, or a report that contains last quarter's financial results, is no longer a static document. The information it contains can be passed to a database or reused elsewhere, outside of the document.

The ability to save a Microsoft Word document in standard XML format helps separate its content from the confines of the document. The content becomes available for automated data-mining and repurposing processes. The content can easily be searched and even modified by processes other than Word, such as server-based data processing.

Because Word is capable of representing its documents as XML, automated server-based processes can now generate Word documents on the fly by pulling together data from various sources. Such a document could then easily be updated on a regular basis, eliminating the manual search for relevant data and unnecessary retyping.
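As a rough illustration of the server-side generation described above, the following Python sketch assembles a minimal document in the Word 2003 XML format from a small set of values. The namespace URI and the mso-application processing instruction are given as commonly documented for Word 2003 XML and should be checked against the Office 2003 XML reference schemas; the function name build_report and the sample figures are hypothetical, and the output has not been verified in Word itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Word 2003 XML schema (assumption: verify against the
# Office 2003 XML reference schemas before relying on it).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W_NS)


def build_report(paragraphs):
    """Return a minimal Word 2003 XML document with one paragraph per string."""
    root = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(root, f"{{{W_NS}}}body")
    for text in paragraphs:
        p = ET.SubElement(body, f"{{{W_NS}}}p")  # paragraph
        r = ET.SubElement(p, f"{{{W_NS}}}r")     # run of text
        t = ET.SubElement(r, f"{{{W_NS}}}t")     # the text itself
        t.text = text
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        # Processing instruction that associates the .xml file with Word.
        '<?mso-application progid="Word.Document"?>\n'
        + ET.tostring(root, encoding="unicode")
    )


# Hypothetical data that a server process might pull from a database.
quarterly_figures = {"Revenue": "1,200,000", "Expenses": "950,000"}
lines = [f"{label}: {value}" for label, value in quarterly_figures.items()]
report_xml = build_report(["Quarterly results"] + lines)
print(report_xml)
```

Rerunning such a script with fresh data would regenerate the document without any retyping, which is the scenario the paragraph above describes.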

Word and XML

Microsoft Word enables you to work with XML documents in two ways:

  • Use the Word XML schema    You can create a document in Word as you normally would and then save it as an XML document. Word uses its own XML schema, WordML, to apply XML tags that store information, such as file properties, and define the structure of the document, such as its paragraphs, headings, and tables. Word also uses XML tags to store formatting and layout information, according to the Word XML schema.
  • Use any XML schema    You can create or open a document in Word, attach any custom XML schema to it, and apply XML tags to the content of the document. When you save this document as an XML document, the XML tags define the structure of the document in terms of the XML schema that is attached to it.

    When you save the document, by default both the Word schema and the custom schema are attached to the document, preserving the data as defined by the custom schema and the rich formatting as defined by the Word XML schema. You also have the option of saving the document as data only, according to the custom schema.

    Whether you use the built-in Word XML schema for a Word document structure or attach your own schema for a structure that is more suitable for your business, any software that can parse XML can read and process the data in a document that you save as an XML document (.xml file).

    For example, if the custom schema is for résumé data, the XML tags in the document will define the structure of the document in terms of name, address, work experience, education, and so on. When you save the document, you have both a richly formatted document that looks professional when printed and a data file that can be processed by any program that can read XML.

You can also store XML data in a document that you save as a Word document (.doc) or template (.dot). However, only Word will be able to read or process the XML.
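To make the résumé example more concrete, here is a minimal Python sketch of how another program might read a document saved as data only. The element names, the urn:example:resume namespace, and the sample values are all invented for illustration; a real custom schema would define its own names.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-only output; the element names and namespace are
# assumptions made for this example, not part of any real schema.
RESUME_XML = """<?xml version="1.0" encoding="UTF-8"?>
<resume xmlns="urn:example:resume">
  <name>Jane Doe</name>
  <address>123 Main Street, Springfield</address>
  <experience>
    <job><title>Analyst</title><employer>Contoso</employer></job>
    <job><title>Manager</title><employer>Fabrikam</employer></job>
  </experience>
  <education>B.S., Computer Science</education>
</resume>
"""

NS = {"r": "urn:example:resume"}


def extract_resume(xml_text):
    """Pull the tagged fields out of the XML into a plain dictionary."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "name": root.findtext("r:name", namespaces=NS),
        "address": root.findtext("r:address", namespaces=NS),
        "jobs": [
            (
                job.findtext("r:title", namespaces=NS),
                job.findtext("r:employer", namespaces=NS),
            )
            for job in root.findall("r:experience/r:job", NS)
        ],
        "education": root.findtext("r:education", namespaces=NS),
    }


record = extract_resume(RESUME_XML)
print(record)
```

Because the tags carry the meaning of each piece of content, the reading program needs no knowledge of Word or of the document's formatting.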

XML tagging

When a custom XML schema is attached to a document, the XML Structure task pane provides a list of elements that are defined in the schema. You apply XML tags to the document by selecting document content and then choosing an element from the list. If the schema defines attributes for an element, you can specify these as well in the XML Structure task pane.

Note  You can attach more than one schema to a document. Elements from all attached schemas are available in the list of elements in the XML Structure task pane.

A check box on the pane enables you to see the XML tags inline, in the context of the document.

If the structure of the document violates the rules of the schema, a purple wavy line marks the spot in the document, and the XML Structure task pane reports the violation.
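The kind of structural rule that triggers such a violation can be sketched in Python. This is not XML Schema validation and not how Word performs the check; it is a hand-written stand-in that tests one hypothetical rule (a resume element must contain name, address, experience, and education, in that order), with all element names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule standing in for a schema: the root element must contain
# exactly these children, once each, in this order.
EXPECTED_CHILDREN = ["name", "address", "experience", "education"]


def find_violations(xml_text):
    """Return a list of human-readable rule violations (empty if none)."""
    root = ET.fromstring(xml_text)
    actual = [child.tag for child in root]
    violations = []
    for tag in EXPECTED_CHILDREN:
        if tag not in actual:
            violations.append(f"missing element <{tag}>")
    for tag in actual:
        if tag not in EXPECTED_CHILDREN:
            violations.append(f"unexpected element <{tag}>")
    # Compare the known elements that are present against the expected order.
    present = [tag for tag in actual if tag in EXPECTED_CHILDREN]
    expected_order = [tag for tag in EXPECTED_CHILDREN if tag in present]
    if present != expected_order:
        violations.append("elements out of order or repeated")
    return violations


valid_doc = (
    "<resume><name>Jane Doe</name><address>123 Main Street</address>"
    "<experience/><education/></resume>"
)
invalid_doc = (
    "<resume><name>Jane Doe</name><phone>555-0100</phone>"
    "<experience/><education/></resume>"
)

print(find_violations(valid_doc))
print(find_violations(invalid_doc))
```

A real schema can express far richer constraints (data types, occurrence counts, attributes), so a full validator would be needed for anything beyond this illustration.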

XSL Transformations

When you open or save an XML document, you can apply an Extensible Stylesheet Language Transformations (XSLT) file that renders the XML data in a particular format. For example, you could have one XSLT that presents data as a specification and another XSLT that presents the same data as a parts list, where quantities and prices are calculated.
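The idea of presenting one set of XML data in two different ways can be sketched as follows. The sketch uses plain Python functions in place of XSLT files, because the Python standard library has no XSLT processor; it only mimics what two transforms might produce. The parts data, element names, and function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical source data; element names are invented for this example.
PARTS_XML = """<parts>
  <part>
    <name>Bracket</name>
    <description>Steel mounting bracket</description>
    <quantity>4</quantity>
    <price>2.50</price>
  </part>
  <part>
    <name>Bolt</name>
    <description>M6 hex bolt</description>
    <quantity>16</quantity>
    <price>0.25</price>
  </part>
</parts>"""


def specification_view(root):
    """First view: each part with its description, no pricing."""
    return [
        f"{part.findtext('name')}: {part.findtext('description')}"
        for part in root.findall("part")
    ]


def parts_list_view(root):
    """Second view: quantities and prices with calculated totals."""
    lines = []
    total = Decimal("0")
    for part in root.findall("part"):
        quantity = int(part.findtext("quantity"))
        price = Decimal(part.findtext("price"))
        subtotal = quantity * price
        total += subtotal
        lines.append(f"{part.findtext('name')} x{quantity} @ {price} = {subtotal}")
    lines.append(f"Total: {total}")
    return lines


root = ET.fromstring(PARTS_XML)
spec_lines = specification_view(root)
parts_lines = parts_list_view(root)
print("\n".join(spec_lines))
print("\n".join(parts_lines))
```

Each view reads only the fields it needs: the specification view ignores quantities and prices, and the parts list view ignores descriptions.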

XSLTs applied when opening a document

An XML document may have more than one XSLT associated with it. When this is the case, you must select the XSLT that you want to use to display the document. You do this in the XML Document pane, where the available XSLTs (data views) are listed.

If no XSLT is associated with an XML document, then Word opens it using its default XSLT, or "Data only view."

If the Word XML schema is attached to the document, Word opens the document without applying an XSLT, even if one is associated with the document.

Note  Rather than applying an XSLT manually, you can define solutions that associate XSLTs with certain types of XML documents. You make this association in the Schema Library, which you can access on the XML Schema tab of the Templates and Add-ins dialog box (Tools menu).

XSLTs applied when saving a document

You can apply an XSLT when you save an XML document by selecting the Apply transform check box and browsing to the XSLT file.

Caution  If you apply an XSLT when you save the file, Word discards any data that the XSLT does not use.