@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@page
@node Internationalization
@section Support for Internationalization

@cindex internationalization
@cindex i18n

Guile provides internationalization support for Scheme programs in two
ways. First, procedures to manipulate text and data in a way that
conforms to particular cultural conventions (i.e., in a
``locale-dependent'' way) are provided in the @code{(ice-9 i18n)}
module. Second, Guile allows the use of GNU @code{gettext} to translate
program message strings.

@menu
* The ice-9 i18n Module::            Honoring cultural conventions.
* Gettext Support::                  Translating message strings.
@end menu

@node The ice-9 i18n Module
@subsection The @code{(ice-9 i18n)} Module

In order to make use of the following functions, one must import the
@code{(ice-9 i18n)} module in the usual way:

@example
(use-modules (ice-9 i18n))
@end example

@cindex libguile-i18n-v-@value{LIBGUILE_I18N_MAJOR}

C programs can use the C functions corresponding to the procedures of
this module by including @code{} and by linking against
@code{libguile-i18n-v-@value{LIBGUILE_I18N_MAJOR}}.

@cindex cultural conventions

The @code{(ice-9 i18n)} module provides procedures to manipulate text
and other data in a way that conforms to the cultural conventions
chosen by the user. Each region of the world or language has its own
customs for representing real numbers, classifying characters,
collating text, and so on. All these aspects comprise the so-called
``cultural conventions'' of that region or language.

@cindex locale
@cindex locale category

Computer systems typically refer to a set of cultural conventions as a
@dfn{locale}. For each particular aspect that comprises those cultural
conventions, a @dfn{locale category} is defined.
For instance, the way characters are classified is defined by the
@code{LC_CTYPE} category, while the language in which program messages
are issued to the user is defined by the @code{LC_MESSAGES} category
(@pxref{Locales, General Locale Information} for details).

@cindex locale object

The procedures provided by this module allow the development of
programs that adapt automatically to any locale setting. As we will
see later, many of the locale-dependent procedures provided by this
module can optionally take a @dfn{locale object} argument. This
additional argument defines the locale settings that must be followed
by the invoked procedure. When it is omitted, the current locale
settings of the process are followed (@pxref{Locales,
@code{setlocale}}).

The following procedures allow the manipulation of such locale
objects.

@deffn {Scheme Procedure} make-locale category-mask locale-name [base-locale]
@deffnx {C Function} scm_make_locale (category_mask, locale_name, base_locale)
Return a reference to a data structure representing a set of locale
datasets. @var{locale-name} should be a string denoting a particular
locale, e.g., @code{"aa_DJ"}. Unlike the @var{category} parameter of
@code{setlocale}, the @var{category-mask} parameter here uses a single
bit for each category, made by OR'ing together @code{LC_*_MASK} bits.
The optional @var{base-locale} argument can be used to specify a
locale object whose settings are to be used as a basis for the locale
object being returned.

The available locale category masks are the following:

@defvar LC_COLLATE_MASK
Represents the collation locale category.
@end defvar

@defvar LC_CTYPE_MASK
Represents the character classification locale category.
@end defvar

@defvar LC_MESSAGES_MASK
Represents the messages locale category.
@end defvar

@defvar LC_MONETARY_MASK
Represents the monetary locale category.
@end defvar

@defvar LC_NUMERIC_MASK
Represents the way numbers are displayed.
@end defvar

@defvar LC_TIME_MASK
Represents the way date and time are displayed.
@end defvar

The following category masks are also available but will not have any
effect on systems that do not support them:

@defvar LC_PAPER_MASK
@defvarx LC_NAME_MASK
@defvarx LC_ADDRESS_MASK
@defvarx LC_TELEPHONE_MASK
@defvarx LC_MEASUREMENT_MASK
@defvarx LC_IDENTIFICATION_MASK
@end defvar

Finally, there is also:

@defvar LC_ALL_MASK
This represents all the locale categories supported by the system.
@end defvar

The @code{LC_*_MASK} variables are bound to integers which may be OR'd
together using @code{logior} (@pxref{Primitive Numerics,
@code{logior}}). For instance, the following invocation creates a
locale object that combines the use of Esperanto for messages and
character classification with the default settings for the other
categories (i.e., the settings of the default @code{C} locale, which
usually represents conventions in use in the USA):

@example
(make-locale (logior LC_MESSAGES_MASK LC_CTYPE_MASK) "eo_EO")
@end example

The following example combines the use of Swedish conventions with
monetary conventions from Croatia:

@example
(make-locale LC_MONETARY_MASK "hr_HR"
             (make-locale LC_ALL_MASK "sv_SE"))
@end example

A @code{system-error} exception (@pxref{Handling Errors}) is raised by
@code{make-locale} when @var{locale-name} does not match any of the
locales compiled on the system. Note that on non-GNU systems, this
error may be raised later, when the locale object is actually used.

@end deffn

@deffn {Scheme Procedure} locale? obj
@deffnx {C Function} scm_locale_p (obj)
Return true if @var{obj} is a locale object.
@end deffn

The following procedures provide support for text collation.

@deffn {Scheme Procedure} string-locale>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_gt (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a locale-dependent way.
If @var{locale} is provided, it should be a locale object (as returned
by @code{make-locale}) and will be used to perform the comparison;
otherwise, the current system locale is used.
@end deffn

@deffn {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a case-insensitive and
locale-dependent way. If @var{locale} is provided, it should be a
locale object (as returned by @code{make-locale}) and will be used to
perform the comparison; otherwise, the current system locale is used.
@end deffn

@deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_eq (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a case-insensitive and
locale-dependent way. If @var{locale} is provided, it should be a
locale object (as returned by @code{make-locale}) and will be used to
perform the comparison; otherwise, the current system locale is used.
@end deffn

@deffn {Scheme Procedure} char-locale>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_gt (c1, c2, locale)
Return true if character @var{c1} is greater than @var{c2} according
to @var{locale} or to the current locale.
@end deffn

@deffn {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_gt (c1, c2, locale)
Return true if character @var{c1} is greater than @var{c2}, in a
case-insensitive way, according to @var{locale} or to the current
locale.
@end deffn

@deffn {Scheme Procedure} char-locale-ci=? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_eq (c1, c2, locale)
Return true if character @var{c1} is equal to @var{c2}, in a
case-insensitive way, according to @var{locale} or to the current
locale.
@end deffn

The procedures below provide support for ``character case mapping'',
i.e., converting characters or strings to their upper-case or
lower-case equivalents.
Note that SRFI-13 provides procedures that look similar
(@pxref{Alphabetic Case Mapping}). However, the SRFI-13 procedures are
locale-independent. Therefore, they do not take into account the
specifics of the customs in use in a particular language or region of
the world. For instance, while most languages using the Latin alphabet
map lower-case letter ``i'' to upper-case letter ``I'', Turkish maps
lower-case ``i'' to ``Latin capital letter I with dot above''. The
following procedures provide idiomatic character mapping.

@deffn {Scheme Procedure} char-locale-downcase chr [locale]
@deffnx {C Function} scm_char_locale_downcase (chr, locale)
Return the lowercase character that corresponds to @var{chr} according
to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} char-locale-upcase chr [locale]
@deffnx {C Function} scm_char_locale_upcase (chr, locale)
Return the uppercase character that corresponds to @var{chr} according
to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} string-locale-upcase str [locale]
@deffnx {C Function} scm_string_locale_upcase (str, locale)
Return a new string that is the uppercase version of @var{str}
according to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} string-locale-downcase str [locale]
@deffnx {C Function} scm_string_locale_downcase (str, locale)
Return a new string that is the lowercase version of @var{str}
according to either @var{locale} or the current locale.
@end deffn

Finally, the following procedures allow programs to read numbers
written according to a particular locale. As an example, in English,
``ten thousand and a half'' is usually written @code{10,000.5} while in
French it is written @code{10000,5}. These procedures account for
these differences.

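As a minimal sketch of how the number-reading procedures documented
below might be used, the following wraps them in two small helpers.
The names @code{parse-integer-or} and @code{parse-french-inexact} are
hypothetical and not part of Guile, and the second helper assumes that
an @code{"fr_FR"} locale is installed on the system.

@example
(use-modules (ice-9 i18n))

;; Hypothetical helper: read STR as a decimal integer in the
;; current process locale; return DEFAULT if nothing was read.
(define (parse-integer-or str default)
  (call-with-values
      (lambda () (locale-string->integer str))
    (lambda (value count)
      ;; VALUE is #f (and COUNT is 0) on failure.
      (or value default))))

;; Hypothetical helper: read STR as an inexact number using
;; French numeric conventions.  Return #f if the number cannot
;; be read or if the assumed "fr_FR" locale is unavailable.
(define (parse-french-inexact str)
  (false-if-exception
   (let ((french (make-locale LC_NUMERIC_MASK "fr_FR")))
     (call-with-values
         (lambda () (locale-string->inexact str french))
       (lambda (value count) value)))))

(parse-integer-or "42" 0)
(parse-integer-or "not a number" -1)
(parse-french-inexact "10000,5")
@end example

The second value returned by each procedure (the number of characters
read) is ignored here. A program that must reject trailing garbage
could compare it against the length of the input string.
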
@deffn {Scheme Procedure} locale-string->integer str [base [locale]]
@deffnx {C Function} scm_locale_string_to_integer (str, base, locale)
Convert string @var{str} into an integer according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
the current process locale. If @var{base} is specified, then it
determines the base of the integer being read (e.g., @code{16} for a
hexadecimal number, @code{10} for a decimal number); by default,
decimal numbers are read. Return two values: an integer (on success)
or @code{#f}, and the number of characters read from @var{str}
(@code{0} on failure).
@end deffn

@deffn {Scheme Procedure} locale-string->inexact str [locale]
@deffnx {C Function} scm_locale_string_to_inexact (str, locale)
Convert string @var{str} into an inexact number according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
the current process locale. Return two values: an inexact number (on
success) or @code{#f}, and the number of characters read from
@var{str} (@code{0} on failure).
@end deffn

@node Gettext Support
@subsection Gettext Support

Guile provides an interface to GNU @code{gettext} for translating
message strings (@pxref{Introduction,,, gettext, GNU @code{gettext}
utilities}).

Messages are collected in domains, so different libraries and programs
maintain different message catalogues. The @var{domain} parameter in
the functions below is a string (it becomes part of the message
catalog filename).

When @code{gettext} is not available, or if Guile was configured with
@samp{--without-nls}, dummy functions doing no translation are
provided. When @code{gettext} support is available in Guile, the
@code{i18n} feature is provided (@pxref{Feature Tracking}).

@deffn {Scheme Procedure} gettext msg [domain [category]]
@deffnx {C Function} scm_gettext (msg, domain, category)
Return the translation of @var{msg} in @var{domain}. @var{domain} is
optional and defaults to the domain set through @code{textdomain}
below.
@var{category} is optional and defaults to @code{LC_MESSAGES}
(@pxref{Locales}).

Normal usage is for @var{msg} to be a literal string.
@command{xgettext} can extract those from the source to form a message
catalogue ready for translators (@pxref{xgettext Invocation,, Invoking
the @command{xgettext} Program, gettext, GNU @code{gettext}
utilities}).

@example
(display (gettext "You are in a maze of twisty passages."))
@end example

@code{_} is a commonly used shorthand; an application can make that an
alias for @code{gettext}. Or a library can make a definition that
uses its specific @var{domain} (so an application can change the
default without affecting the library).

@example
(define (_ msg) (gettext msg "mylibrary"))
(display (_ "File not found."))
@end example

@code{_} is also a good place to perhaps strip disambiguating extra
text from the message string, as for instance in @ref{GUI program
problems,, How to use @code{gettext} in GUI programs, gettext, GNU
@code{gettext} utilities}.
@end deffn

@deffn {Scheme Procedure} ngettext msg msgplural n [domain [category]]
@deffnx {C Function} scm_ngettext (msg, msgplural, n, domain, category)
Return the translation of @var{msg}/@var{msgplural} in @var{domain},
with a plural form chosen appropriately for the number @var{n}.
@var{domain} is optional and defaults to the domain set through
@code{textdomain} below. @var{category} is optional and defaults to
@code{LC_MESSAGES} (@pxref{Locales}).

@var{msg} is the singular form, and @var{msgplural} the plural. When
no translation is available, @var{msg} is used if @math{@var{n} = 1},
or @var{msgplural} otherwise. When translated, the message catalogue
can have a different rule, and can have more than two possible forms.

As per @code{gettext} above, normal usage is for @var{msg} and
@var{msgplural} to be literal strings, since @command{xgettext} can
extract them from the source to build a message catalogue.
For example,

@example
(define (done n)
  (format #t (ngettext "~a file processed\n"
                       "~a files processed\n" n)
          n))

(done 1) @print{} 1 file processed
(done 3) @print{} 3 files processed
@end example

It's important to use @code{ngettext} rather than plain @code{gettext}
for plurals, since the rules for singular and plural forms in English
are not the same in other languages. Only @code{ngettext} will allow
translators to give correct forms (@pxref{Plural forms,, Additional
functions for plural forms, gettext, GNU @code{gettext} utilities}).
@end deffn

@deffn {Scheme Procedure} textdomain [domain]
@deffnx {C Function} scm_textdomain (domain)
Get or set the default gettext domain. When called with no parameter,
the current domain is returned. When called with a parameter,
@var{domain} is set as the current domain, and that new value is
returned. For example,

@example
(textdomain "myprog")
@result{} "myprog"
@end example
@end deffn

@deffn {Scheme Procedure} bindtextdomain domain [directory]
@deffnx {C Function} scm_bindtextdomain (domain, directory)
Get or set the directory under which to find message files for
@var{domain}. When called without a @var{directory}, the current
setting is returned. When called with a @var{directory},
@var{directory} is set for @var{domain} and that new setting is
returned. For example,

@example
(bindtextdomain "myprog" "/my/tree/share/locale")
@result{} "/my/tree/share/locale"
@end example

When using Autoconf/Automake, an application should arrange for the
configured @code{localedir} to get into the program (by substituting,
or by generating a config file) and set that for its domain. This
ensures the catalogue can be found even when installed in a
non-standard location.
@end deffn

@deffn {Scheme Procedure} bind-textdomain-codeset domain [encoding]
@deffnx {C Function} scm_bind_textdomain_codeset (domain, encoding)
Get or set the text encoding to be used by @code{gettext} for messages
from @var{domain}.
@var{encoding} is a string, the name of a coding system, for instance
@nicode{"8859_1"}. (On a Unix/POSIX system the @command{iconv} program
can list all available encodings.)

When called without an @var{encoding}, the current setting is
returned, or @code{#f} if none has been set yet. When called with an
@var{encoding}, it is set for @var{domain} and that new setting is
returned. For example,

@example
(bind-textdomain-codeset "myprog")
@result{} #f
(bind-textdomain-codeset "myprog" "latin-9")
@result{} "latin-9"
@end example

The encoding requested can be different from that of the translated
data file; messages will be recoded as necessary. But note that when
there is no translation, @code{gettext} returns its @var{msg}
unchanged, i.e., without any recoding. For that reason, source message
strings are best kept as plain ASCII.

Currently Guile has no understanding of multi-byte characters, and
string functions won't recognise character boundaries in multi-byte
strings. An application will at least be able to pass such strings
through to some output though. Perhaps this will change in the
future.
@end deffn

@c Local Variables:
@c TeX-master: "guile.texi"
@c ispell-local-dictionary: "american"
@c End: