Author XML Schemas
What is an XML Schema?
An XML Schema is an XML-based syntax for defining how an XML document is marked up. XML Schema is a schema specification recommended by Microsoft, and it has many advantages over the document type definition (DTD), the initial schema specification for defining an XML model. DTDs have several drawbacks, including non-XML syntax, no support for datatyping, and no extensibility. For example, a DTD does not let you define element content as anything other than another element or a string. For more information about DTDs, see the World Wide Web Consortium (W3C) XML Recommendation. XML Schema improves on DTDs in several ways, including its use of XML syntax and its support for datatyping and namespaces. For example, an XML Schema lets you specify an element as an integer, a float, a Boolean, a URL, and so on.
Microsoft® XML Core Services (MSXML) 5.0 for Microsoft Office in Microsoft Internet Explorer 5.0 and later can validate an XML document against either a DTD or an XML Schema.
How can I create an XML Schema?
The following XML document references its schema through the default namespace declared on the root element.
<class xmlns="x-schema:classSchema.xml">
  <student studentID="13429">
    <name>James Smith</name>
    <GPA>3.8</GPA>
  </student>
</class>
You'll notice in the preceding document that the default namespace is x-schema:classSchema.xml. This tells the parser to validate the entire document against the schema (x-schema) at the following URL (classSchema.xml).
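As a variation, the schema reference can be bound to a namespace prefix instead of the default namespace, so that only the prefixed elements are checked against the schema. The sketch below is illustrative only: the prefix c is an arbitrary choice, and it assumes the parser treats a prefixed x-schema namespace the same way it treats the default one.

```xml
<!-- Sketch: same document, with the schema bound to the prefix "c"
     (an arbitrary, hypothetical prefix) rather than the default namespace. -->
<c:class xmlns:c="x-schema:classSchema.xml">
  <c:student studentID="13429">
    <c:name>James Smith</c:name>
    <c:GPA>3.8</c:GPA>
  </c:student>
</c:class>
```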
The following is the entire schema for the preceding document. The schema begins with the <Schema> element containing the declaration of the schema namespace and, in this case, the declaration of the datatypes namespace as well. The first, xmlns="urn:schemas-microsoft-com:xml-data", indicates that this XML document is an XML Schema. The second, xmlns:dt="urn:schemas-microsoft-com:datatypes", allows you to type element and attribute content by using the dt prefix on the type attribute within their ElementType and AttributeType declarations.
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <AttributeType name='studentID' dt:type='string' required='yes'/>
  <ElementType name='name' content='textOnly'/>
  <ElementType name='GPA' content='textOnly' dt:type='float'/>
  <ElementType name='student' content='mixed'>
    <attribute type='studentID'/>
    <element type='name'/>
    <element type='GPA'/>
  </ElementType>
  <ElementType name='class' content='eltOnly'>
    <element type='student'/>
  </ElementType>
</Schema>
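To see what the schema above constrains, it can help to look at a document that breaks its rules. The following is a minimal sketch of an instance that a validating parser would be expected to reject. The two problems are chosen for illustration, and the exact error messages depend on the parser.

```xml
<!-- Sketch of a document expected to FAIL validation against classSchema.xml. -->
<class xmlns="x-schema:classSchema.xml">
  <!-- Problem 1: studentID is declared with required='yes' but is missing here. -->
  <student>
    <name>James Smith</name>
    <!-- Problem 2: GPA is declared with dt:type='float', so non-numeric text should be rejected. -->
    <GPA>three point eight</GPA>
  </student>
</class>
```

Restoring the studentID attribute and replacing the text with a number such as 3.8 brings the document back in line with the declarations.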
The declaration elements that you use to define elements and attributes are described as follows.
Element | Description |
---|---|
<ElementType> | Assigns a type and conditions to an element, and what, if any, child elements it can contain. |
<AttributeType> | Assigns a type and conditions to an attribute. |
<attribute> | Declares that a previously defined attribute type can appear within the scope of the named <ElementType> element. |
<element> | Declares that a previously defined element type can appear within the scope of the named <ElementType> element. |
The content of the schema begins with the <AttributeType> and <ElementType> declarations of the innermost elements.
<AttributeType name='studentID' dt:type='string' required='yes'/>
<ElementType name='name' content='textOnly'/>
<ElementType name='GPA' content='textOnly' dt:type='float'/>
The next <ElementType> declaration is followed by its attribute and child elements. When an element has attributes or child elements, they must be included this way in its <ElementType> declaration. They must also be previously declared in their own <ElementType> or <AttributeType> declaration.
<ElementType name='student' content='mixed'>
  <attribute type='studentID'/>
  <element type='name'/>
  <element type='GPA'/>
</ElementType>
This process is continued throughout the rest of the schema until every element and attribute has been declared.
Unlike DTDs, XML Schemas support an open content model, which lets you do such things as type elements and apply default values without necessarily restricting content.
In the following schema, the <GPA> element is typed and has an attribute with a default value, but no other nodes are declared within the <student> element.
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <AttributeType name="scale" default="4.0"/>
  <ElementType name="GPA" content="textOnly" dt:type="float">
    <attribute type="scale"/>
  </ElementType>
  <AttributeType name="studentID"/>
  <ElementType name="student" content="eltOnly" model="open" order="many">
    <attribute type="studentID"/>
    <element type="GPA"/>
  </ElementType>
</Schema>
The preceding schema allows you to validate only the area with which you are concerned. This gives you more control over the level of validation for your document and allows you to use some of the features provided by the schema without having to employ strict validation.
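As a rough illustration of the open content model, the following sketch shows an instance that carries an element the schema never declares. It rests on several assumptions: the schema above is saved under the hypothetical file name gpaSchema.xml, the extra element lives in a made-up namespace (urn:example:extra), and the parser leaves elements from a namespace without a schema unvalidated.

```xml
<!-- Assumes the open-model schema is saved as gpaSchema.xml (hypothetical file name). -->
<student xmlns="x-schema:gpaSchema.xml"
         xmlns:ext="urn:example:extra"
         studentID="13429">
  <!-- Declared and typed as float; the scale attribute is omitted,
       so the declared default of 4.0 is expected to apply. -->
  <GPA>3.8</GPA>
  <!-- Not declared in the schema; expected to be tolerated because
       the student element type uses model="open". -->
  <ext:advisor>Dr. Lee</ext:advisor>
</student>
```

The <GPA> element is still checked against its float type, while the undeclared element is simply passed through.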
Try it!
Try authoring a schema for the following XML document.
<order>
  <customer>
    <name>Fidelma McGinn</name>
    <phone_number>425-655-3393</phone_number>
  </customer>
  <item>
    <number>5523918</number>
    <description>shovel</description>
    <price>39.99</price>
  </item>
  <date_of_purchase>1998-10-23</date_of_purchase>
  <date_of_delivery>1998-11-03</date_of_delivery>
</order>
After you have completed the schema, run it through the XML Validator.
MSDN® Online Downloads provides a set of XML sample files, including an XML document with an accompanying schema. Download these samples to work with the XML document and the schema. To test the validity of your XML against a schema, you can load the document through the XML Validator or simply view the XML file in the MIMETYPE Viewer.
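For comparison with your own attempt at the order exercise, here is one possible schema, offered as a sketch and not as the reference answer. The datatype choices (int for the item number, float for the price, date for the two dates) are assumptions made for illustration. A closed content model with the default sequential order is also assumed.

```xml
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- Innermost (leaf) elements are declared first so later declarations can refer to them. -->
  <ElementType name="name" content="textOnly"/>
  <ElementType name="phone_number" content="textOnly"/>
  <ElementType name="number" content="textOnly" dt:type="int"/>
  <ElementType name="description" content="textOnly"/>
  <ElementType name="price" content="textOnly" dt:type="float"/>
  <ElementType name="date_of_purchase" content="textOnly" dt:type="date"/>
  <ElementType name="date_of_delivery" content="textOnly" dt:type="date"/>

  <!-- Container elements refer only to element types declared above. -->
  <ElementType name="customer" content="eltOnly">
    <element type="name"/>
    <element type="phone_number"/>
  </ElementType>
  <ElementType name="item" content="eltOnly">
    <element type="number"/>
    <element type="description"/>
    <element type="price"/>
  </ElementType>

  <!-- The root element type comes last. -->
  <ElementType name="order" content="eltOnly">
    <element type="customer"/>
    <element type="item"/>
    <element type="date_of_purchase"/>
    <element type="date_of_delivery"/>
  </ElementType>
</Schema>
```

To use it, the <order> element would need a default namespace of the form x-schema: followed by whatever file name you save the schema under.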
The following are some considerations.
- <ElementType> and <AttributeType> declarations must precede <element> and <attribute> content declarations that refer to these types. For example, in the preceding schema, the <ElementType> declaration for the <GPA> element must precede the <ElementType> declaration for the <student> element.
- The default value of the order attribute depends on the value of the content attribute. When the content is set to "eltOnly", the order defaults to seq. When the content is set to "mixed", the order defaults to many.
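The two considerations above can be illustrated with a small schema fragment that declares its leaf types first and writes out the order values that would otherwise be implied. This is a sketch only: the note element type is hypothetical and exists just to show the mixed case.

```xml
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
  <!-- Leaf types come first so the declarations below can refer to them. -->
  <ElementType name="name" content="textOnly"/>
  <ElementType name="GPA" content="textOnly"/>

  <!-- content="eltOnly": order="seq" is the default, written out here for clarity.
       Child elements must appear in the order listed. -->
  <ElementType name="student" content="eltOnly" order="seq">
    <element type="name"/>
    <element type="GPA"/>
  </ElementType>

  <!-- content="mixed": order="many" is the default, written out here for clarity.
       Text and the listed elements may be interleaved in any order. -->
  <ElementType name="note" content="mixed" order="many">
    <element type="name"/>
  </ElementType>
</Schema>
```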