Windows Installer XML Overview
Introduction
Windows Installer XML, or WiX, provides a schema that describes a Windows Installer database (MSI or MSM), as well as tools to convert the XML description files into a usable database. The second version of the schema, wix.xsd, adds extra content to ease the creation of multiple Windows Installer databases from a single set of XML documents. The WiX tools follow the traditional compile-and-link model used to create executables from source code. This document provides a brief introduction to using the tools to compile and link WiX source code into Windows Installer databases.
Note: This document assumes you have a working knowledge of the Windows Installer database format.
.wxs & .wixobj – Windows Installer XML Files
The .wxs extension is used by all source files in the Windows Installer XML system. These .wxs files are analogous to .cpp files for C++ or .cs files for C#. The .wxs files are preprocessed and then compiled into WiX object files, which use the extension .wixobj. When all of the source files have been compiled into object files, the linker is used to collect the object files together and create a Windows Installer database. More details on the compiler and linker are provided later in this document.
Structure of .wxs files
All .wxs files are well-formed XML documents that contain a single root element named <Wix/>. The rest of the source file may or may not adhere to the WiX schema before preprocessing. However, after being preprocessed, all source files must conform to the WiX schema or they will fail to compile.
The root <Wix/> element can contain at most one of the following two elements as a child: <Product/> or <Module/>. However, there can be an unbounded number of <Fragment/> elements as children of the root <Wix/> element. When a source file is compiled into an object file, each instance of these elements creates a new section in the object file. Therefore, these three elements are often referred to as section elements.
It is important to note that there can be only one <Product/> or <Module/> section element per source file because they are compiled into special sections called entry sections. Entry sections are used as starting points in the linking process. Sections, entry sections, and the entire linking process are described in greater detail later in this document.
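The section rules above can be sketched in a few lines of Python. This is only an illustration, not how the WiX tools are implemented: the sample source text is made up, it omits namespaces and the attributes a real source file would need, and the function name list_sections is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source text. Element names come from the text above;
# required attributes and namespaces are omitted for brevity.
SAMPLE_WXS = """
<Wix>
  <Product>
    <Property Id="ExampleProperty" Value="1" />
  </Product>
  <Fragment />
  <Fragment />
</Wix>
"""

ENTRY_ELEMENTS = {"Product", "Module"}
SECTION_ELEMENTS = ENTRY_ELEMENTS | {"Fragment"}


def list_sections(wxs_text):
    """Return (tag, is_entry) for each section element under <Wix/>."""
    root = ET.fromstring(wxs_text)
    if root.tag != "Wix":
        raise ValueError("root element must be <Wix/>")
    sections = [(child.tag, child.tag in ENTRY_ELEMENTS)
                for child in root if child.tag in SECTION_ELEMENTS]
    # At most one <Product/> or <Module/> is allowed per source file.
    entry_count = sum(1 for _, is_entry in sections if is_entry)
    if entry_count > 1:
        raise ValueError("only one <Product/> or <Module/> per source file")
    return sections


if __name__ == "__main__":
    for tag, is_entry in list_sections(SAMPLE_WXS):
        print(tag, "(entry section)" if is_entry else "(section)")
```

Each tuple returned corresponds to one section that would appear in the compiled object file.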
The children of the section elements define the contents of the Windows Installer database. You’ll recognize <Property/> elements that map to entries in the Property table and a hierarchy of <Directory/> elements that build up the Directory table. Most elements contain an “Id” attribute that will act as the primary key for the resulting row in the Windows Installer database. Note that in the first release of the WiX schema, the primary key was represented by the text of the element. This location for the primary key was undesirable for several reasons, so the key has been moved to the “Id” attribute. In most cases, the “Id” attribute also defines a symbol when the source file is compiled into an object file.
Symbols and references
Every symbol in an object file is composed of the element name plus the unique identifier from the “Id” attribute. Symbols are important because they can be referenced by other sections from any source file. For example, a <Directory/> structure can be defined in a <Fragment/> in one source file and a <Component/> can be defined under a different source file’s <Fragment/>. By making the <DirectoryRef/> element a parent of the <Component/>, an explicit reference is created to the symbol defined by a <Directory/> in the first source file. The linker is then responsible for stitching the symbol and the reference together in a single Windows Installer database. In some cases, implicit references are generated by the compiler while processing a source file. These implicit references behave identically to explicit references.
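As a rough illustration of the Directory/DirectoryRef example above, the following Python sketch collects symbols and explicit references from two hypothetical source files. The "ElementName:Id" string form, the rule that any element ending in "Ref" is a reference, and the function name symbols_and_references are assumptions made for this sketch, not the actual object file encoding.

```python
import xml.etree.ElementTree as ET

# Two hypothetical source files; identifiers are made up.
DIRECTORIES_WXS = """
<Wix>
  <Fragment>
    <Directory Id="ExampleDir" Name="Example" />
  </Fragment>
</Wix>
"""

COMPONENTS_WXS = """
<Wix>
  <Fragment>
    <DirectoryRef Id="ExampleDir">
      <Component Id="ExampleComponent" />
    </DirectoryRef>
  </Fragment>
</Wix>
"""


def symbols_and_references(wxs_text):
    """Collect symbols (element name + Id) and explicit references."""
    symbols, references = set(), set()
    for element in ET.fromstring(wxs_text).iter():
        ident = element.get("Id")
        if ident is None:
            continue
        if element.tag.endswith("Ref"):
            # <DirectoryRef Id="X"/> refers to the symbol of <Directory Id="X"/>.
            references.add(f"{element.tag[:-3]}:{ident}")
        else:
            symbols.add(f"{element.tag}:{ident}")
    return symbols, references


if __name__ == "__main__":
    dir_symbols, _ = symbols_and_references(DIRECTORIES_WXS)
    comp_symbols, comp_refs = symbols_and_references(COMPONENTS_WXS)
    # Every reference in the second file is satisfied by a symbol in the first.
    print(comp_refs <= (dir_symbols | comp_symbols))
```

The final check is a simplified stand-in for what the linker does when it stitches a symbol and a reference together.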
In addition to the simple references described above, WiX supports specific complex references. Complex references are used in cases where the linker must generate extra information to link the symbol and reference together. The perfect example of a complex reference is in the Windows Installer’s Feature/Component relationship. When a <Component/> is referenced explicitly by a <Feature/> through a <ComponentRef/> element, the linker must take the <Feature/>’s symbol and the <Component/>’s symbol and add an entry to the FeatureComponents table.
This Feature/Component relationship is even more complex because certain elements in a <Component/>, for example <Shortcut/>, have references back to the primary Feature associated with the Component. These references from a child element of a <Component/> are called reverse references or sometimes feature backlinks. Processing complex references and reverse references is probably the most difficult work the linker has to do.
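The complex and reverse references described above can be sketched as follows. This is a simplified model under stated assumptions: the sample identifiers are made up, the function name link_features is hypothetical, and treating the first referencing Feature as the "primary" Feature is an assumption of this sketch rather than documented linker behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical source text; required attributes are omitted for brevity.
FEATURE_WXS = """
<Wix>
  <Product>
    <Feature Id="ExampleFeature">
      <ComponentRef Id="ExampleComponent" />
    </Feature>
  </Product>
  <Fragment>
    <Component Id="ExampleComponent">
      <Shortcut Id="ExampleShortcut" />
    </Component>
  </Fragment>
</Wix>
"""


def link_features(wxs_text):
    """Return FeatureComponents rows and the Feature backlink per Shortcut."""
    root = ET.fromstring(wxs_text)
    feature_components = []  # (feature id, component id) rows
    primary_feature = {}     # component id -> first referencing feature (assumption)
    for feature in root.iter("Feature"):
        for ref in feature.findall("ComponentRef"):
            # Complex reference: the linker must generate an extra row.
            feature_components.append((feature.get("Id"), ref.get("Id")))
            primary_feature.setdefault(ref.get("Id"), feature.get("Id"))
    shortcut_features = {}
    for component in root.iter("Component"):
        for shortcut in component.findall("Shortcut"):
            # Reverse reference: the Shortcut points back at the Feature.
            shortcut_features[shortcut.get("Id")] = primary_feature.get(
                component.get("Id"))
    return feature_components, shortcut_features


if __name__ == "__main__":
    rows, backlinks = link_features(FEATURE_WXS)
    print(rows)
    print(backlinks)
```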
Note that the process of defining and referencing symbols is new to the second version of the WiX toolset. Previously, it was necessary to package Components into Merge Modules and use the merge process to do rudimentary symbol linking. This new system for defining symbols is more flexible and avoids the overhead of ensuring each Merge Module’s tokens are unique.
Structure of the .wixobj file
A .wixobj file is created by the compiler for each source file compiled. The .wixobj file is an XML document that follows the objects.xsd schema defined in the WiX project. As stated above, the .wixobj file contains one or more sections that, in turn, contain symbols and references to other symbols.
While the symbols and references are arguably the most important pieces of data in the .wixobj file, they are rarely the bulk of the information. Instead, most of a typical .wixobj file is composed of <table/>, <row/> and <field/> elements that provide the raw data to be placed in the Windows Installer database. In many cases, the linker will not only process the symbols and references but also use and update the raw data from the .wixobj file.
It is interesting to note that the object file schema, objects.xsd, uses camel casing where the source file schema, wix.xsd, uses Pascal casing. This was a conscious choice to indicate that the object files are not intended to be edited by the user. In fact, all schemas that define data to be processed only by the WiX tools use camel casing.
candle – Windows Installer XML Compiler
The Windows Installer XML compiler is exposed by candle.exe. candle is responsible for preprocessing the input .wxs files into well-formed XML documents that are valid against the WiX schema, wix.xsd. Then, each preprocessed source file is compiled into a .wixobj file.
The compilation process is relatively straightforward. The WiX schema lends itself to a simple recursive descent parser. The compiler processes each element in turn, creating new symbols, calculating the necessary references, and generating the raw data for the .wixobj file.
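To make the recursive descent idea concrete, here is a minimal Python sketch that handles only <Directory/> elements. It is not candle's implementation: the output dictionary, the row tuple layout (loosely modeled on the Directory table as id, parent, name) and the function names compile_element and compile_source are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source text with a two-level directory hierarchy.
SOURCE = """
<Wix>
  <Fragment>
    <Directory Id="RootDir" Name="Root">
      <Directory Id="ChildDir" Name="Child" />
    </Directory>
  </Fragment>
</Wix>
"""


def compile_element(element, parent_id, output):
    """Visit one element, record its symbol and row, then recurse."""
    ident = element.get("Id")
    if element.tag == "Directory":
        output["symbols"].append(f"Directory:{ident}")
        # Raw data for the object file: (table, id, parent id, name).
        output["rows"].append(
            ("Directory", ident, parent_id, element.get("Name")))
    for child in element:
        # Children inherit the nearest enclosing identifier as their parent.
        compile_element(child, ident if ident is not None else parent_id, output)


def compile_source(wxs_text):
    """Compile a source string into symbols and raw rows."""
    output = {"symbols": [], "rows": []}
    compile_element(ET.fromstring(wxs_text), None, output)
    return output


if __name__ == "__main__":
    result = compile_source(SOURCE)
    print(result["symbols"])
    print(result["rows"])
```

A fuller compiler would dispatch on every element name in the schema, but the shape stays the same: one handler per element, each recursing into its children.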
The second version of candle is not significantly different from the first implementation. Any changes were either made to enable the new symbol/reference linking or based on feedback from customers. Some of the differences between versions include: the new object file format is XML instead of MSI, modularization of primary keys now happens at link time, and binary streams are imported at link time.
light – Windows Installer XML Linker
The Windows Installer XML linker is exposed by light.exe. light is responsible for processing one or more .wixobj files, retrieving metadata from various external files and creating a Windows Installer database (MSI or MSM). When necessary, light will also create cabinets and embed streams in the created Windows Installer database.
The linker begins by searching the set of object files provided on the command line to find the entry section. If more than one entry section is found, light fails with an error. This failure is necessary because the entry section defines what type of Windows Installer database is being created: an MSI (<Product/>) or an MSM (<Module/>). It is not possible to create two databases from a single link operation.
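The entry section search can be sketched as below. The Section class and find_entry_section function are hypothetical names for this illustration. The text above only states that more than one entry section is an error; raising an error when none is found is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Section:
    kind: str  # "Product", "Module", or "Fragment"
    name: str  # hypothetical label used only in this sketch


def find_entry_section(sections):
    """Return the single entry section and the database type it implies."""
    entries = [s for s in sections if s.kind in ("Product", "Module")]
    if len(entries) > 1:
        raise ValueError("more than one entry section found")
    if not entries:
        # Assumption: the source text does not describe this case.
        raise ValueError("no entry section found")
    entry = entries[0]
    database_type = "MSI" if entry.kind == "Product" else "MSM"
    return entry, database_type


if __name__ == "__main__":
    sections = [Section("Fragment", "dirs"), Section("Product", "main")]
    entry, database_type = find_entry_section(sections)
    print(entry.name, database_type)
```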
While the linker is determining the entry section, the symbols defined in each object file are stored in a symbol table. After the entry section is found, the linker attempts to resolve all of the references in the section by finding symbols in the symbol table. When a symbol is found in a different section, the linker recursively attempts to resolve references in the new section. This process of gathering the sections necessary to resolve all of the references continues until all references are satisfied. If a symbol cannot be found in any of the provided object files, the linker aborts processing with an error indicating the undefined symbol.
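The resolution process described above amounts to a recursive walk from the entry section. The following Python sketch shows one way to express it. The dictionary layout for sections, the symbol strings and the function name gather_sections are assumptions for illustration, not light's actual data structures.

```python
def gather_sections(sections, entry_name):
    """Return the names of all sections needed to resolve every reference.

    sections maps a section name to {"symbols": set, "references": set}.
    """
    # Build the symbol table: symbol -> name of the section defining it.
    symbol_table = {}
    for name, section in sections.items():
        for symbol in section["symbols"]:
            symbol_table[symbol] = name

    needed = []

    def visit(name):
        if name in needed:
            return  # already gathered; also guards against cycles
        needed.append(name)
        for reference in sorted(sections[name]["references"]):
            if reference not in symbol_table:
                raise LookupError(f"undefined symbol: {reference}")
            # Recurse into the section that defines the referenced symbol.
            visit(symbol_table[reference])

    visit(entry_name)
    return needed


if __name__ == "__main__":
    example = {
        "entry": {"symbols": set(), "references": {"Directory:A"}},
        "frag1": {"symbols": {"Directory:A"}, "references": {"Component:B"}},
        "frag2": {"symbols": {"Component:B"}, "references": set()},
        "unused": {"symbols": {"Component:C"}, "references": set()},
    }
    print(gather_sections(example, "entry"))
```

Note that a section nothing refers to (the "unused" entry in the usage example) is never gathered, which matches the description of collecting only the sections necessary to satisfy all references.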
After all of the sections have been found, complex and reverse references are processed. This processing is where Components and Merge Modules are hooked to their parent Features or, in the case of Merge Modules, Components are added to the ModuleComponents table. The reverse reference processing adds the appropriate Feature identifier to the necessary fields for elements like Shortcut, Class, and TypeLib.
Once all of the references are resolved, the linker processes all of the rows: retrieving the language, version, and hash for referenced files, calculating the media layout, and including the necessary standard actions to ensure a successful installation sequence. This part of the processing typically generates additional rows, which are associated with the entry section to ensure they are included in the final Windows Installer database.
Finally, light works through the mechanics of generating IDT files and importing them into the Windows Installer database. After the database is fully created, the final post-processing is done to merge in any Merge Modules and create a cabinet if necessary. The result is a fully functional Windows Installer database.
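For readers unfamiliar with IDT files, the sketch below writes table rows in the tab-delimited layout as commonly described for Windows Installer text archive files: a line of column names, a line of column type codes, a line with the table name followed by its primary key columns, and then one line per row. The function name to_idt is hypothetical, and the exact layout and the sample Directory table values should be treated as assumptions to verify against the Windows Installer documentation.

```python
def to_idt(table, columns, types, keys, rows):
    """Render rows as tab-delimited IDT-style text (layout is an assumption)."""
    lines = [
        "\t".join(columns),        # column names
        "\t".join(types),          # column type codes
        "\t".join([table] + keys), # table name, then primary key columns
    ]
    for row in rows:
        # Empty (null) fields are written as empty strings.
        lines.append("\t".join("" if value is None else str(value)
                               for value in row))
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    text = to_idt(
        "Directory",
        ["Directory", "Directory_Parent", "DefaultDir"],
        ["s72", "S72", "l255"],
        ["Directory"],
        [("TARGETDIR", None, "SourceDir")],
    )
    print(text)
```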