Internationalization of applications

The Nagare framework includes a nagare.i18n module that can help you to internationalize your Web application.

There are many aspects to take into account when internationalizing an application:

  • extracting the messages (strings) used inside the application and translating them in each language
  • formatting numbers, dates, times, currencies for display or parsing them after user input
  • dealing with timezone calculations
  • displaying the requested pages in the proper language according to the client browser language settings or a setting in the application

All these aspects are addressed by the nagare.i18n module.

Setup an application for internationalization

First, the Nagare i18n extras should be installed in your virtual environment:

$ easy_install "nagare[i18n]"

This command installs Babel and pytz which are necessary to use the nagare.i18n module.

However, if you plan to use the internationalization feature of Nagare in your Web application, you’d better change the setup.py file that is generated when you create the application, in order to specify that you need the i18n extra as a requirement:

setup(
      ...
      install_requires = ('nagare[i18n]', ...),
      ...
)

Then, Nagare will be installed with the i18n extra when you install your application.

When you create a new application with nagare-admin create-app, Nagare automatically creates a setup.cfg file that contains a default configuration for internationalization:

[extract_messages]
keywords = _ , _N:1,2 , _L , _LN:1,2 , gettext , ugettext , ngettext:1,2 , ungettext:1,2 , lazy_gettext , lazy_ugettext , lazy_ngettext:1,2 , lazy_ungettext:1,2
output_file = data/locale/messages.pot

[init_catalog]
input_file = data/locale/messages.pot
output_dir = data/locale
domain = messages

[update_catalog]
input_file = data/locale/messages.pot
output_dir = data/locale
domain = messages

[compile_catalog]
directory = data/locale
domain = messages

This configuration file is used by the distutils commands that are automatically registered to setuptools when Babel is installed. Here is the list of the commands:

Command Effect
python setup.py extract_messages Extract the messages from the python sources to a .pot file
python setup.py init_catalog -l <lang> Create a new message catalog (.po file) for a given language, initialized from the .pot file
python setup.py update_catalog [-l <lang>] Update the message catalog(s) with the new messages found in the .pot file after an extract_messages
python setup.py compile_catalog [-l <lang>] Compile the message catalog(s) in binary form so that they can be used inside the application

See the Babel setup documentation for a complete reference of the commands options.

The catalog files are stored in the data/locale directory by default.

Marking the messages to translate

The nagare.i18n module provides functions to translate the strings that appear in your code. The most useful function is nagare.i18n._ which returns the translation of the given string as a unicode string:

from nagare import presentation
from nagare.i18n import _

class I18nExample(object):
    pass

@presentation.render_for(I18nExample)
def render(self, h, *args):
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'
    return h.root

app = I18nExample

There are also many more functions that deals with strings in the nagare.i18n module:

Function Effect
_ Shortcut to ugettext
_N Shortcut to ungettext
_L Shortcut to lazy_ugettext
_LN Shortcut to lazy_ungettext
gettext Return the localized translation of a message as a 8-bit string encoded with the catalog’s charset encoding, based on the current locale set by Nagare
ugettext Return the localized translation of a message as a unicode string, based on the current locale set by Nagare
ngettext Like gettext but consider plurals forms
nugettext Like ugettext but consider plurals forms
lazy_gettext Like gettext but with lazy evaluation
lazy_ugettext Like ugettext but with lazy evaluation
lazy_ngettext Like ngettext but with lazy evaluation
lazy_nugettext Like nugettext but with lazy evaluation

Note that the lazy_* variants don’t fetch the translation when they are evaluated. Instead Nagare automatically takes care of fetching the translation at the rendering time when it encounters a lazy translation. This is especially useful when you want to use translated strings in module or class constants, which are evaluated at definition time and consequently don’t have access to the current locale which, as you will see later, is scoped to the request. Here is how you would define a class constant for use in a view:

from nagare import presentation
from nagare.i18n import _, _L

class I18nExample(object):
    # the following line is evaluated when the module is loaded, so _ would not work here
    TITLE = _L('Internationalization Example')

@presentation.render_for(I18nExample)
def render(self, h, *args):
    with h.h1:
        h << self.TITLE
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'
    return h.root

...

Extracting the messages

Babel automatically takes care of the messages extraction from the sources when you run the python setup.py extract_messages command. Specifically, when it extracts the localizable strings from the sources files, it only considers the strings enclosed in specific keywords calls. The list of the supported keywords appear in the setup.cfg file:

[extract_messages]
keywords = _ , _N:1,2 , _L , _LN:1,2 , gettext , ugettext , ngettext:1,2 , ungettext:1,2 , lazy_gettext , lazy_ugettext , lazy_ngettext:1,2 , lazy_ungettext:1,2

All the nagare.i18n functions that deal with strings appear in this list, so you don’t have to change this line in most cases. However, if you need to add additional translation functions, don’t forget to update the keywords list in this file.

Furthermore, the setup.py file created by nagare-admin create-app contains a line that configures the messages extractors that Babel will use in order to extract the messages from your source files:

setup(
      ...
      message_extractors = { 'i18n_example' : [('**.py', 'python', None)] },
      ...
)

By default, Nagare set up an extractor for the python files. If you have other files that contain localizable strings (such as .js files), you’ll have to add additional message extractors by yourself. Please refer the Babel messages extractor documentation to understand how to add additional message extractors.

If you run the python setup.py extract_messages command on the previous code, you should obtain a messages.pot file that looks like this one:

# Translations template for nagare.examples.
# Copyright (C) 2012 ORGANIZATION
# This file is distributed under the same license as the nagare.examples
# project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2012.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: nagare.examples 0.3.0\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2012-03-05 11:05+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 0.9.6\n"

#: nagare/examples/i18n.py:18
msgid "Internationalization Example"
msgstr ""

#: nagare/examples/i18n.py:25
msgid "This is a translated string"
msgstr ""

All the localizable strings of your application should appear in this file after extraction. Note that the string This string is not translated of the previous example was not included because it was not enclosed in a call to a translation keyword.

Translating the messages

Once the .pot file is produced (after message extraction), you have to create a message catalog for each language with the help of the init_catalog catalog, for example:

$ python setup.py init_catalog -l fr
$ python setup.py init_catalog -l en

Then update the .po files that have just been created and provide the translations for all the strings that do not have one. You can use either a dedicated .po file editor, or a simple text editor and fill the msgstr lines for each msgid. Please refer to the PO Files section of the Gettext Manual for more information about this process.

Finally, compile the catalogs in binary form:

$ python setup.py compile_catalog

If you change the source files and add new messages to translate, you have to update the message catalogs:

$ python setup.py extract_messages
$ python setup.py update_catalog

Then, fill in the missing translations in the .po files and compile the catalogs again.

$ python setup.py compile_catalog

Formatting numbers, dates, times, currencies and dealing with timezones

The nagare.i18n module also defines functions for formatting numbers, dates, times, currencies and for timezones calculations, according to the locale of the user. We have grouped these functions by category. See the Nagare nagare.i18n API documentation for more information on these functions.

Numbers

Function Effect
get_decimal_symbol Return the symbol used to separate decimal fractions
get_plus_sign_symbol Return the plus sign symbol
get_minus_sign_symbol Return the minus sign symbol
get_exponential_symbol Return the symbol used to separate mantissa and exponent
get_group_symbol Return the symbol used to separate groups of thousands
format_number Return the given integer number formatted
format_decimal Return the given decimal number formatted
format_percent Return the formatted percentage value
format_scientific Return the value formatted in scientific notation, e.g. 1.23E06
parse_number Parse the localized number string into a long integer
parse_decimal Parse the localized decimal string into a float

Currencies

Function Effect
get_currency_name Return the name used for the specified currency
get_currency_symbol Return the symbol used for the specified currency
format_currency Return the formatted currency value

Dates & timezones

Function Effect
get_period_names Return the names for day periods (AM/PM)
get_day_names Return the day names in the specified format
get_month_names Return the month names in the specified format
get_quarter_names Return the quarter names in the specified format
get_era_names Return the era names used in the specified format
get_date_format Return the date formatting pattern in the specified format
get_datetime_format Return the datetime formatting pattern in the specified format
get_time_format Return the time formatting pattern in the specified format
get_timezone_gmt Return the timezone associated with the given datetime object, formatted as a string indicating the offset from GMT
get_timezone_location Return a representation of the given timezone using the “location format”, i.e. the human-readable name of the timezone
get_timezone_name Return the localized display name for the given timezone
format_time Return a time formatted according to the given pattern
format_date Return a date formatted according to the given pattern
format_datetime Return a datetime formatted according to the given pattern
parse_time Parse a time from a string
parse_date Parse a date from a string
to_timezone Return a localized datetime object
to_utc Return a UTC datetime object

Setting the locale of the application

The last step of the internationalization process is to initialize the locale of the application according to a language setting. There are many ways to implement a language setting:

  • use a per-application setting, for example provide the language setting in the application configuration
  • use the client browser language settings, i.e. deal with the Accept-Language headers from the HTTP request
  • use a URL prefix, for example /en for english and /fr for french (see Significative “RESTful” URLs to learn how to set up URLs)
  • use a per-user setting: for example, read the locale setting from a user profile stored in a database
  • use a per-session setting, for example store the language setting as an attribute of a component file

You may even want to combine two or more of these schemes in your application! As you will see next, Nagare makes it easy to change the locale at runtime, it doesn’t really matter where the language setting comes from.

The locale can be changed with either the nagare.wsgi.WSGIApp.set_default_locale method or the nagare.wsgi.WSGIApp.set_locale method.

Example 1: application-wide locale

Use the set_default_locale() if the locale if only set once for the application:

from nagare import wsgi, 18n

class I18nExampleApp(wsgi.WSGIApp):
    def __init__(self, root_factory):
        super(I18nExample, self).__init__(root_factory)

        self.set_default_locale(i18n.Locale('fr'))  # Hard coded locale

or:

from nagare import wsgi, 18n

class I18nExampleApp(wsgi.WSGIApp):
    def set_config(self, config_filename, config, error):
        super(I18nExample, self).set_config(config_filename, config, error)

        language = ...  # Read the langage from the ``config`` configuration
        self.set_default_locale(i18n.Locale(language))


 app = I18nExampleApp(lambda: component.Component(I18nExample()))

Example 2: using a negociated browser locale

Use the set_locale() method if the locale can change between requests.

The best place where to change the locale is in the nagare.wsgi.WSGIApp.start_request method since we have access to both the request and to the authenticated user (when necessary) there. Furthermore, since a user may change his language setting between two requests, the locale setting is scoped to the HTTP request by design in Nagare.

As an illustration, let’s use a negociated browser locale:

from nagare import presentation, component, wsgi, i18n

...

class I18nExampleApp(wsgi.WSGIApp):
    AVAILABLE_LANGUAGES = (
        ('en',),  # English
        ('fr',),  # French
    )

    def start_request(self, root, request, response):
        super(I18nExampleApp, self).start_request(root, request, response)

        # Set the default locale to a browser negotiated locale
        locale = i18n.NegotiatedLocale(
            request,
            self.AVAILABLE_LANGUAGES,
            default_locale=self.AVAILABLE_LANGUAGES[0]
        )
        self.set_locale(locale)


app = I18nExampleApp(lambda: component.Component(I18nExample()))

We first create an application class that inherit from wsgi.WSGIApp so that we can override the default start_request behavior in order to install a negociated browser locale.

Since we must pass the list of the available locales to the i18n.NegotiatedLocale constructor, we define a AVAILABLE_LANGUAGES class constant. The available locales should be given as a list of (language, territory) tuples. The territory has been omitted here since we don’t use territory variants of the languages yet.

The nagare.i18n.NegotiatedLocale constructor accepts the request as first parameter, the list of the available locales as second parameter and some optional parameters. It examines the Accept-Language headers of the request in order to determine the best locale to use for the current request according to the available locales. The default_locale optional parameter can be used to define the default (language, territory) value when the negociation fails.

Example 3: using a per-session locale setting

Let’s show another example: we will allow the user to change the language setting using links and store the setting in the session in order to persist the change for the next requests. Here is the code:

# Encoding: utf-8
from nagare import presentation, component, wsgi, i18n, var
from nagare.i18n import _, _L


class I18nExample(object):
    # The following line is evaluated when the module is loaded, so _ would not work here
    TITLE = _L('Internationalization Example')

    AVAILABLE_LANGUAGES = (
        ('en', u'English'),  # English
        ('fr', u'Français'),  # French
    )

    def __init__(self):
        self.language = self.AVAILABLE_LANGUAGES[0][0]

    def change_language(self, language):
        self.language = language

        # We will use the same directory than the current locale to
        # locate the translation files
        dirname = i18n.get_locale().get_translation_directory()

        # Immediatly change the locale of the current request
        i18n.set_locale(i18n.Locale(language, dirname=dirname))


@presentation.render_for(I18nExample)
def render(self, h, comp, *args):
    with h.h1:
        h << self.TITLE
    h << _('This is a translated string') << h.br
    h << 'This string is not translated'

    h << h.hr

    # Language change
    for language, label in self.AVAILABLE_LANGUAGES:
        h << h.a(label, title=label).action(self.change_language, language)
        h << ' '

    return h.root


class I18nExampleApp(wsgi.WSGIApp):
    def start_request(self, root, request, response):
        super(I18nExampleApp, self).start_request(root, request, response)

        # Install the locale from the language stored in the root object
        language = root().language
        self.set_locale(i18n.Locale(language))


app = I18nExampleApp(lambda: component.Component(I18nExample()))

We moved the AVAILABLE_LANGUAGES list to the I18nExample component since we’ll show a list of language change links in its view.

In the associated view, we iterate the AVAILABLE_LANGUAGES list in order to display a link for each language. The links actions call the set_language method, passing the associated language code to it.

The I18nExample.set_language method performs two operations. First, it remembers the language setting as an attribute. Since the component will be pickled into the session after the request, the language setting is effectively stored in the session and will be back in the next request when the session is unpickled. The language setting is then used in the I18nExampleApp.start_request method for reinstalling the locale for subsequent requests.

Second, it changes the locale in the current request so that the rendering of the component is immediately affected by the change. However, we can’t call I18nExampleApp.set_locale here, because we don’t have access to the application instance from the component. Instead, we have to install the locale ourselves by calling the i18n.set_locale directly.

These three examples explain the basics for setting the locale in a Nagare application. Real-world implementations may store the language setting by other means (see the list at the top-most paragraph of this section) and use a combination of these two techniques.