Localization and your MSI file.
Jenny signed off on my Monday night blog hours so I'm curled up in my big comfy chair with my laptop ready to discuss some details of the Windows Installer. Honestly, every time I sit in this big chair I consider writing that book again. But not tonight. For tonight we talk about localization and the Windows Installer.
Before I get started, I want to throw out a very important disclaimer up front. I am not a localization expert and I personally have never fully-localized a product. Most of what I'm presenting here is information that I've gleaned from talking to or just watching localizers. The rest of it I stole from the Windows Installer SDK.
For those of you who have not been indoctrinated in building global software, know that "localization" is the process of making your software available for other "locations" (or locales). "Localizers" are the individuals responsible for localizing your software. Obviously most "localizers" speak, read, or otherwise comprehend more than one language and culture. This particular talent is one of the major reasons I make a really lousy localizer. I only really understand American English, C/C++, C#, VBScript and a bit of Australian English (from living with Peter for a few years). But I digress.
Most people think localization is all about translating the text in their program from one language to another. While this is an important part of the process, there are many other facets of localization. For example, directly related to the text translation process but often neglected is planning for translated text to take more (or less) space in dialog boxes than the first language did. I remember a localizer mentioning to me once that, in general, a dialog box for German text needs to be somewhere around 1.5 times larger than a dialog box with the same text in English. Another important facet of localization is adjusting text and images to be geopolitically appropriate. Words and images accepted in one part of the world are not always appropriate for other parts of the world. Thus it is important to understand the cultures, not just the languages, when localizing software.
Okay, so that is probably enough to cover the "Localization" part of this blog entry's title, now let's move on to "your MSI file". For the remainder of this blog entry, unless I specifically mention it, the term "MSI file" will be synonymous with "Windows Installer database" (which includes not only MSI files but Merge Modules [.msm files] as well). So what we're really talking about here is localizing your <Products/> and <Modules/> if you use the WiX toolset.
As promised in the beginning, much of what I'm covering here is covered in what I consider the "Windows Installer Bible", the Windows Installer SDK. When I have questions about the way the Windows Installer works, I go back and refer to that documentation. Fortunately, for me and this blog, the Windows Installer SDK can get kinda' cryptic at times. So I'm here to add more words to what already exists.
In particular, the Localization Overview in the Windows Installer SDK is a great place to start (and I expect I will refer to it several times in this blog entry). That help topic does a pretty good job providing a check-list of things to do when localizing your MSI file. I particularly like the first step, plan for localization.
I am fully aware that many people save setup for the end of the product cycle. I personally believe this practice is very reckless and ill-advised (especially considering there are now tools like the WiX toolset that can integrate directly into your build process). However, I've also noticed that localization is often considered after setup! That doubles the trouble because right when you need to lock in the Componentization of your product you're going to be adding more files. Bad planning can make this a horrible chore.
So, here's my standard template for success with localization. First, break out all of the localizable text in your product into a separate resource-only DLL. Second, put that resource-only DLL in a sub-directory named for the language of the resource-only DLL. Keep the name of the resource-only DLL the same though. Since I'm an old Office guy, I usually use the LCID (1033 is American English) for the directory name, but I've seen the trend towards using the ISO locale names (en-us is American English) since the Common Language Runtime goes that way. Third, store the default installation's language in your per-application data store. For example, an HKLM registry key is an okay store if your product was installed per-machine and an HKCU registry key would be okay if your product was installed per-user. The Registry/@Root="HKMU" value in Windows Installer XML syntax was designed for this type of scenario. Fourth, store the user's current language preference in a per-user data store (a per-user registry key works okay) when it differs from the installed value. Finally, when you boot your application, load the resource-only DLL from the appropriate sub-directory based on the per-user key if it exists and the per-application key if it does not exist.
You can also do more interesting scenarios if you want to take the system's current language into account, but that's for people that know more about localization than I do. Of course, you should also have some fallback plan if your registry keys are all busted. For example, if the per-application registry key was deleted you could repair a portion of your product. I'll try tossing more advanced scenarios in a later blog entry.
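To make the lookup order in that template concrete, here is a minimal sketch in Python. It only shows the decision logic; reading the two registry values is left out, and the DLL name "resources.dll", the function name, and the hard-coded "1033" last resort are all assumptions I made up for illustration (a real product might trigger a repair instead of falling back silently).

```python
from pathlib import PureWindowsPath
from typing import Optional

# Hypothetical file name; the point is that it is the same in every language directory.
RESOURCE_DLL_NAME = "resources.dll"


def resource_dll_path(install_dir: str,
                      user_language: Optional[str],
                      app_language: Optional[str],
                      fallback_language: str = "1033") -> PureWindowsPath:
    """Choose the language sub-directory for the resource-only DLL.

    Order: the per-user preference if present, then the per-application
    default, then a last-resort fallback (an assumption of this sketch).
    """
    language = user_language or app_language or fallback_language
    return PureWindowsPath(install_dir) / language / RESOURCE_DLL_NAME


# A user who prefers German (LCID 1031) on an install that defaulted to 1033:
print(resource_dll_path(r"C:\Program Files\MyApp", "1031", "1033"))
```

Passing None for a value stands in for "that registry key does not exist".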
Now that you have a plan for your product's organization, you can go back to thinking about your MSI file. First of all, you'll want to think about the codepage for your MSI file. Remember MSI files are not Unicode. That means if you pick the wrong codepage for your MSI file, you'll get square boxes or question marks showing up in your database. The Windows Installer SDK talks about setting the codepage, but if you use the WiX toolset you only need to specify the codepage (for example, 1252 for Western European text) in the Product/@Codepage or Module/@Codepage attribute for your MSI or MSM file respectively. If you're curious, you can see that the WiX toolset does exactly what the Windows Installer SDK says in Binder.cs in the Binder's SetDatabaseCodepage() method.
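If you want a quick feel for what "the wrong codepage" does to your strings, the following Python sketch checks whether text fits in a given Windows codepage. It uses Python's own codecs, not the Windows Installer, so treat it as an approximation of the failure mode rather than a statement of what the installer does internally. The function names are mine.

```python
def fits_codepage(text: str, codepage: int) -> bool:
    """Return True if every character in text can be encoded in the codepage."""
    try:
        text.encode(f"cp{codepage}")
        return True
    except UnicodeEncodeError:
        return False


def lossy_round_trip(text: str, codepage: int) -> str:
    """Encode with replacement and decode again; unmappable characters become '?'."""
    name = f"cp{codepage}"
    return text.encode(name, errors="replace").decode(name)


german = "Größe"
japanese = "インストール"

print(fits_codepage(german, 1252))     # German text fits in Western European
print(fits_codepage(japanese, 1252))   # Japanese text does not
print(fits_codepage(japanese, 932))    # but it fits in the Japanese codepage
print(lossy_round_trip(japanese, 1252))
```

Running a check like this over your localized strings before you build is one cheap way to catch a mismatched codepage early.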
Also note that because the MSI files are not Unicode files they cannot be truly multi-lingual. This is okay because if we skip over the Overview's steps 3 and 4 (I'll discuss those more later), we see in step 5 that the MSI file has a ProductLanguage Property that must be set to the LCID of the product. That ProductLanguage Property maps to the Product/@Language or Module/@Language attributes. One of the bugs I discussed in my previous blog entry is enabling the localization of those attributes.
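Since the ProductLanguage value is an LCID while directory names may be either LCIDs or locale names, a small lookup between the two can be handy. The sketch below is in Python and lists only a handful of well-known LCIDs; the table and helper names are illustrative, not part of any SDK.

```python
# A few well-known LCIDs and their locale names; extend as needed.
LCID_TO_LOCALE = {
    1031: "de-de",  # German (Germany)
    1033: "en-us",  # English (United States)
    1036: "fr-fr",  # French (France)
    1041: "ja-jp",  # Japanese (Japan)
}


def locale_name(lcid: int) -> str:
    """Look up the locale name for an LCID; raises KeyError if it is not in the table."""
    return LCID_TO_LOCALE[lcid]


def lcid_hex(lcid: int) -> str:
    """Format an LCID the way it is often written in Windows documentation."""
    return f"0x{lcid:04X}"


print(locale_name(1033), lcid_hex(1033))
```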
I'm going to lump in the Overview's steps 5, 6, 7, and 8 under the heading "Things That Identify Your Product as a Unique Identity in the World." The world is a scary place out there with lots of other products to collide with and otherwise get lost among. Make sure you follow all of these steps to ensure that you can find your product when it comes time to patch or upgrade. I've seen a few cases where setup developers were being particularly lazy and thought they could skip over some of these steps. They showed up a few months after their products shipped asking how they could target the appropriate localized version of their MSI file.
In one particular case, a product had some geo-political issue preventing the product from being allowed across the border of some country. I don't remember all of the finer details but I believe the final recommendation was to build a new "politically correct" MSI package, send that off for manufacturing and eat the cost of the thousands of CDs that had already been stamped. I can only imagine how much fun they had creating coasters in their microwave (or how much explaining they had to do with their "higher-ups").
I was originally going to toss the Overview's step 9 in with the above, but (as many of you know) the Component Rules are very near and dear to my heart. Follow my standard template for organizing your product and you'll naturally have to put each resource-only DLL in its own Component. More importantly you will only need to put the non-localized executables in their own Components. If you had localized text in the executables, you'd be in the unfortunate position of needing to create different Components to install the different language versions of the file even though they would all get installed to the same location (a Component Rule violation). Also, as mentioned above, with this product layout I'll be able to demonstrate some interesting advanced install tricks.
Finally, all that is left now is the real localization work (steps 3 and 4 from the Overview). I know this is something of a letdown after such a long blog entry, but I'm going to save the details of how the WiX toolset can help with the localization process for later. My hope is to re-finish (er, "un-break") the localization features in the WiX toolset tomorrow night. Then I will try to write up the step-by-step process in the WiX toolset with a really simple example. Some time after that, I'll create a more complicated example that takes advantage of some of those advanced tricks I was talking about.
Until next time, take a look at this localization example from the Windows Installer SDK, and keep coding. You know I am.
Copyright © Rob Mensching