21.1.2.2 The GNUTranslations class
The gettext module provides one additional class derived from NullTranslations: GNUTranslations. This class overrides _parse() to enable reading GNU gettext format .mo files in both big-endian and little-endian format. It also coerces both message ids and message strings to Unicode.
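As a rough illustration of the byte-order handling, the sketch below serializes one tiny catalog twice, once little-endian and once big-endian, and loads both copies from memory. build_mo() is a hypothetical helper written for this example and is not part of the gettext module, and the catalog contents are made up. The sketch also assumes a current Python interpreter, where ugettext() is absent and gettext() already returns Unicode, so the lookup falls back accordingly.

```python
import io
import struct
from gettext import GNUTranslations

MO_MAGIC = 0x950412DE  # magic number of the GNU .mo format


def build_mo(messages, byteorder="<"):
    """Hypothetical helper: pack {msgid: msgstr} byte strings as .mo data."""
    ids = sorted(messages)
    count = len(ids)
    strings_start = 28 + 16 * count  # 7-word header plus two index tables
    id_index, str_index, blob = [], [], b""
    for msgid in ids:
        id_index += [len(msgid), strings_start + len(blob)]
        blob += msgid + b"\0"
    for msgid in ids:
        str_index += [len(messages[msgid]), strings_start + len(blob)]
        blob += messages[msgid] + b"\0"
    header = struct.pack(byteorder + "7I", MO_MAGIC, 0, count,
                         28, 28 + 8 * count, 0, 0)
    index = struct.pack(byteorder + "%dI" % (4 * count),
                        *(id_index + str_index))
    return header + index + blob


messages = {
    b"": b"Content-Type: text/plain; charset=utf-8\n",
    b"Hello": b"Bonjour",
}

results = {}
for name, order in (("little-endian", "<"), ("big-endian", ">")):
    cat = GNUTranslations(io.BytesIO(build_mo(messages, order)))
    # Prefer ugettext() as documented here; fall back to gettext() on
    # interpreters where only the Unicode-returning gettext() exists.
    lookup = getattr(cat, "ugettext", cat.gettext)
    results[name] = lookup("Hello")

print(results)
```

Both byte orders should yield the same translation, since the parser picks the unpacking format from the magic number it reads.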
GNUTranslations parses optional meta-data out of the translation catalog. It is a convention with GNU gettext to include meta-data as the translation for the empty string. This meta-data is in RFC 822-style key: value pairs, and should contain the Project-Id-Version key. If the key Content-Type is found, then its charset property is used to initialize the "protected" _charset instance variable, defaulting to None if not found. If the charset encoding is specified, then all message ids and message strings read from the catalog are converted to Unicode using this encoding.
The ugettext() method always returns a Unicode string, while gettext() returns an encoded 8-bit string. For the message id arguments of both methods, either Unicode strings or 8-bit strings containing only US-ASCII characters are acceptable. Note that the Unicode versions of the methods (i.e. ugettext() and ungettext()) are the recommended interface to use for internationalized Python programs.
The entire set of key/value pairs is placed into a dictionary and set as the "protected" _info instance variable.
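To make the meta-data handling concrete, here is a minimal sketch that hand-builds a little-endian .mo image containing only the empty message id and reads the parsed values back through the info() and charset() accessors of the base class. The project name and charset are invented for the example, and the byte layout is written out by hand rather than produced by any gettext tool.

```python
import io
import struct
from gettext import GNUTranslations

# Meta-data stored as the "translation" of the empty message id
# (values are made up for this example).
metadata = (b"Project-Id-Version: demo 1.0\n"
            b"Content-Type: text/plain; charset=utf-8\n")

# Hand-built little-endian .mo image holding only that one entry:
# a 7-word header, one (length, offset) pair per table, then the strings.
data = struct.pack("<7I", 0x950412DE, 0, 1, 28, 36, 0, 0)
data += struct.pack("<2I", 0, 44)              # empty msgid at offset 44
data += struct.pack("<2I", len(metadata), 45)  # meta-data follows its NUL
data += b"\0" + metadata + b"\0"

cat = GNUTranslations(io.BytesIO(data))
info = cat.info()        # accessor for the _info dictionary
charset = cat.charset()  # accessor for _charset

# The parser stores the keys in lower case.
print(info["project-id-version"])
print(info["content-type"])
print(charset)
```

Reading the values through info() and charset() avoids touching the "protected" variables directly.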
If the .mo file's magic number is invalid, or if other problems occur while reading the file, instantiating the GNUTranslations class can raise IOError.
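One way a caller might guard against an unreadable catalog is sketched below. load_catalog() is a hypothetical wrapper invented for this example, and the input is deliberately bogus: 28 zero bytes, which is long enough to hold a header but carries the wrong magic number.

```python
import io
from gettext import GNUTranslations


def load_catalog(data):
    """Hypothetical wrapper: return a parsed catalog, or None if unreadable."""
    try:
        return GNUTranslations(io.BytesIO(data))
    except IOError as exc:
        print("not a usable .mo file: %s" % exc)
        return None


# 28 zero bytes: header-sized, but the magic number is invalid.
broken = load_catalog(b"\0" * 28)
```

Returning None here is only one possible policy; a program could equally fall back to a NullTranslations instance so that lookups return the message ids unchanged.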
The following methods are overridden from the base class implementation:
- gettext(message)
Look up the message id in the catalog and return the corresponding message string, as an 8-bit string encoded with the catalog's charset encoding, if known. If there is no entry in the catalog for the message id, and a fallback has been set, the lookup is forwarded to the fallback's gettext() method. Otherwise, the message id is returned.
- lgettext(message)
Equivalent to gettext(), but the translation is returned
in the preferred system encoding, if no other encoding was explicitly
set with set_output_charset().
New in version 2.4.
- ugettext(message)
Look up the message id in the catalog and return the corresponding message string, as a Unicode string. If there is no entry in the catalog for the message id, and a fallback has been set, the lookup is forwarded to the fallback's ugettext() method. Otherwise, the message id is returned.
- ngettext(singular, plural, n)
Do a plural-forms lookup of a message id. singular is used as
the message id for purposes of lookup in the catalog, while n is
used to determine which plural form to use. The returned message
string is an 8-bit string encoded with the catalog's charset encoding,
if known.
If the message id is not found in the catalog, and a fallback is specified, the request is forwarded to the fallback's ngettext() method. Otherwise, singular is returned when n is 1, and plural is returned in all other cases.
New in version 2.3.
- lngettext(singular, plural, n)
Equivalent to ngettext(), but the translation is returned
in the preferred system encoding, if no other encoding was explicitly
set with set_output_charset().
New in version 2.4.
- ungettext(singular, plural, n)
Do a plural-forms lookup of a message id. singular is used as
the message id for purposes of lookup in the catalog, while n is
used to determine which plural form to use. The returned message
string is a Unicode string.
If the message id is not found in the catalog, and a fallback is specified, the request is forwarded to the fallback's ungettext() method. Otherwise, singular is returned when n is 1, and plural is returned in all other cases.
Here is an example:
    import os
    from gettext import GNUTranslations

    # somefile is an open file object for a .mo catalog
    n = len(os.listdir('.'))
    cat = GNUTranslations(somefile)
    message = cat.ungettext(
        'There is %(num)d file in this directory',
        'There are %(num)d files in this directory',
        n) % {'num': n}
New in version 2.3.
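The fallback behaviour described for the lookup methods above can be sketched with two small in-memory catalogs chained through add_fallback(). build_mo() is a hypothetical helper written for this example, and the catalog entries are invented. As an assumption beyond this section, the sketch targets a current Python interpreter and uses gettext() and ngettext() wherever ugettext() and ungettext() are not available.

```python
import io
import struct
from gettext import GNUTranslations


def build_mo(messages):
    """Hypothetical helper: pack {msgid: msgstr} byte strings as .mo data."""
    ids = sorted(messages)
    count = len(ids)
    strings_start = 28 + 16 * count  # 7-word header plus two index tables
    id_index, str_index, blob = [], [], b""
    for msgid in ids:
        id_index += [len(msgid), strings_start + len(blob)]
        blob += msgid + b"\0"
    for msgid in ids:
        str_index += [len(messages[msgid]), strings_start + len(blob)]
        blob += messages[msgid] + b"\0"
    header = struct.pack("<7I", 0x950412DE, 0, count,
                         28, 28 + 8 * count, 0, 0)
    index = struct.pack("<%dI" % (4 * count), *(id_index + str_index))
    return header + index + blob


primary = GNUTranslations(io.BytesIO(build_mo({b"Yes": b"Oui"})))
backup = GNUTranslations(io.BytesIO(build_mo({
    b"No": b"Non",
    # Plural entries join the forms with a NUL byte.
    b"%d file\0%d files": b"%d fichier\0%d fichiers",
})))
primary.add_fallback(backup)

# Use the Unicode methods documented here when they exist.
lookup = getattr(primary, "ugettext", primary.gettext)
nlookup = getattr(primary, "ungettext", primary.ngettext)

found = lookup("Yes")            # present in the primary catalog
forwarded = lookup("No")         # missing in primary, answered by backup
untranslated = lookup("Maybe")   # missing everywhere: message id is returned
plural_two = nlookup("%d file", "%d files", 2) % 2
missing_one = nlookup("%d dir", "%d dirs", 1) % 1   # singular, since n is 1
missing_many = nlookup("%d dir", "%d dirs", 5) % 5  # plural otherwise

print(found, forwarded, untranslated, plural_two, missing_one, missing_many)
```

Neither catalog carries meta-data here, so plural selection relies on the default rule that treats every n other than 1 as plural.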