6.28.3 Internationalizing your programs and modules

Python 2.4

6.28.3 Internationalizing your programs and modules

Internationalization (I18N) refers to the operation by which a program is made aware of multiple languages. Localization (L10N) refers to the adaptation of your program, once internationalized, to the local language and cultural habits. In order to provide multilingual messages for your Python programs, you need to take the following steps:

  1. prepare your program or module by specially marking translatable strings
  2. run a suite of tools over your marked files to generate raw messages catalogs
  3. create language specific translations of the message catalogs
  4. use the gettext module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in your files. Any string that needs to be translated should be marked by wrapping it in _('...') -- that is, a call to the function _(). For example:

filename = 'mylog.txt'
message = _('writing a log message')
fp = open(filename, 'w')
fp.write(message)
fp.close()

In this example, the string 'writing a log message' is marked as a candidate for translation, while the strings 'mylog.txt' and 'w' are not.

The Python distribution comes with two tools which help you generate the message catalogs once you've prepared your source code. These may or may not be available from a binary distribution, but they can be found in a source distribution, in the Tools/i18n directory.

The pygettext6.4 program scans all your Python source code looking for the strings you previously marked as translatable. It is similar to the GNU gettext program except that it understands all the intricacies of Python source code, but knows nothing about C or C++ source code. You don't need GNU gettext unless you're also going to be translating C code (such as C extension modules).

pygettext generates textual Uniforum-style human readable message catalog .pot files, essentially structured human readable files which contain every marked string in the source code, along with a placeholder for the translation strings. pygettext is a command line script that supports a similar command line interface as xgettext; for details on its use, run:

pygettext.py --help

Copies of these .pot files are then handed over to the individual human translators who write language-specific versions for every supported natural language. They send you back the filled in language-specific versions as a .po file. Using the msgfmt.py6.5 program (in the Tools/i18n directory), you take the .po files from your translators and generate the machine-readable .mo binary catalog files. The .mo files are what the gettext module uses for the actual translation processing during run-time.

How you use the gettext module in your code depends on whether you are internationalizing your entire application or a single module.


Footnotes

François Pinard has written a program called xpot which does a similar job. It is available as part of his po-utils package at http://www.iro.umontreal.ca/contrib/po-utils/HTML/.
msgfmt.py is binary compatible with GNU msgfmt except that it provides a simpler, all-Python implementation. With this and pygettext.py, you generally won't need to install the GNU gettext package to internationalize your Python applications.


See About this document... for information on suggesting changes.