glibc/2.7/manual/message.texi

    1: @node Message Translation, Searching and Sorting, Locales, Top
    2: @c %MENU% How to make the program speak the user's language
    3: @chapter Message Translation
    4: 
    5: The program's interface with the user should be designed to make the
    6: user's task easier.  One possibility is to print messages in whatever
    7: language the user prefers.
    8: 
    9: Printing messages in different languages can be implemented in different
   10: ways.  One could add all the different languages in the source code and
   11: choose among the variants every time a message has to be printed.  This is
   12: certainly not a good solution since extending the set of languages is
   13: difficult (the code must be changed) and the code itself can become
   14: really big with dozens of message sets.
   15: 
   16: A better solution is to keep the message sets for each language
   17: in separate files which are loaded at runtime depending on the language
   18: selection of the user.
   19: 
   20: The GNU C Library provides two different sets of functions to support
   21: message translation.  The problem is that neither of the interfaces is
   22: officially defined by the POSIX standard.  The @code{catgets} family of
   23: functions is defined in the X/Open standard but this is derived from
   24: industry decisions and therefore not necessarily based on reasonable
   25: decisions.
   26: 
   27: As mentioned above the message catalog handling provides easy
   28: extendibility by using external data files which contain the message
   29: translations.  I.e., these files contain for each of the messages used
   30: in the program a translation for the appropriate language.  So the tasks
   31: of the message handling functions are
   32: 
   33: @itemize @bullet
   34: @item
   35: locate the external data file with the appropriate translations.
   36: @item
   37: load the data and make it possible to address the messages
   38: @item
   39: map a given key to the translated message
   40: @end itemize
   41: 
   42: The two approaches mainly differ in the implementation of this last
   43: step.  The design decisions made for this influence everything else.
   44: 
   45: @menu
   46: * Message catalogs a la X/Open::  The @code{catgets} family of functions.
   47: * The Uniforum approach::         The @code{gettext} family of functions.
   48: @end menu
   49: 
   50: 
   51: @node Message catalogs a la X/Open
   52: @section X/Open Message Catalog Handling
   53: 
   54: The @code{catgets} functions are based on the simple scheme:
   55: 
   56: @quotation
   57: Associate every message to translate in the source code with a unique
   58: identifier.  To retrieve a message from a catalog file solely the
   59: identifier is used.
   60: @end quotation
   61: 
   62: This means for the author of the program that s/he will have to make
   63: sure the meaning of the identifier in the program code and in the
   64: message catalogs is always the same.
   65: 
   66: Before a message can be translated the catalog file must be located.
   67: The user of the program must be able to guide the responsible function
   68: to find whatever catalog the user wants.  This is separate from what
   69: the programmer had in mind.
   70: 
   71: All the types, constants and functions for the @code{catgets} functions
   72: are defined/declared in the @file{nl_types.h} header file.
   73: 
   74: @menu
   75: * The catgets Functions::      The @code{catgets} function family.
   76: * The message catalog files::  Format of the message catalog files.
   77: * The gencat program::         How to generate message catalog files which
   78:                                 can be used by the functions.
   79: * Common Usage::               How to use the @code{catgets} interface.
   80: @end menu
   81: 
   82: 
   83: @node The catgets Functions
   84: @subsection The @code{catgets} function family
   85: 
   86: @comment nl_types.h
   87: @comment X/Open
   88: @deftypefun nl_catd catopen (const char *@var{cat_name}, int @var{flag})
   89: The @code{catopen} function tries to locate the message data file named
   90: @var{cat_name} and loads it when found.  The return value is of an
   91: opaque type and can be used in calls to the other functions to refer to
   92: this loaded catalog.
   93: 
   94: The return value is @code{(nl_catd) -1} in case the function failed and
   95: no catalog was loaded.  The global variable @var{errno} contains a code
   96: for the error causing the failure.  But even if the function call
   97: succeeded this does not mean that all messages can be translated.
   98: 
   99: Locating the catalog file must happen in a way which lets the user of
  100: the program influence the decision.  It is up to the user to decide
  101: about the language to use and sometimes it is useful to use alternate
  102: catalog files.  All this can be specified by the user by setting some
  103: environment variables.
  104: 
  105: The first problem is to find out where all the message catalogs are
  106: stored.  Every program could have its own place to keep all the
  107: different files but usually the catalog files are grouped by languages
  108: and the catalogs for all programs are kept in the same place.
  109: 
  110: @cindex NLSPATH environment variable
  111: To tell the @code{catopen} function where the catalog for the program
  112: can be found the user can set the environment variable @code{NLSPATH} to
  113: a value which describes her/his choice.  Since this value must be usable
  114: for different languages and locales it cannot be a simple string.
  115: Instead it is a format string (similar to @code{printf}'s).  An example
  116: is
  117: 
  118: @smallexample
  119: /usr/share/locale/%L/%N:/usr/share/locale/%L/LC_MESSAGES/%N
  120: @end smallexample
  121: 
  122: First one can see that more than one directory can be specified (with
  123: the usual syntax of separating them by colons).  The next things to
  124: observe are the format elements, @code{%L} and @code{%N} in this case.
  125: The @code{catopen} function knows about several of them and the
  126: replacement for all of them is of course different.
  127: 
  128: @table @code
  129: @item %N
  130: This format element is substituted with the name of the catalog file.
  131: This is the value of the @var{cat_name} argument given to
  132: @code{catopen}.
  133: 
  134: @item %L
  135: This format element is substituted with the name of the currently
  136: selected locale for translating messages.  How this is determined is
  137: explained below.
  138: 
  139: @item %l
  140: (This is the lowercase ell.) This format element is substituted with the
  141: language element of the locale name.  The string describing the selected
  142: locale is expected to have the form
  143: @code{@var{lang}[_@var{terr}[.@var{codeset}]]} and this format uses the
  144: first part @var{lang}.
  145: 
  146: @item %t
  147: This format element is substituted by the territory part @var{terr} of
  148: the name of the currently selected locale.  See the explanation of the
  149: format above.
  150: 
  151: @item %c
  152: This format element is substituted by the codeset part @var{codeset} of
  153: the name of the currently selected locale.  See the explanation of the
  154: format above.
  155: 
  156: @item %%
  157: Since @code{%} is used as a meta character there must be a way to
  158: express the @code{%} character in the result itself.  Using @code{%%}
  159: does this just like it works for @code{printf}.
  160: @end table
  161: 
  162: 
  163: Using @code{NLSPATH} allows arbitrary directories to be searched for
  164: message catalogs while still allowing different languages to be used.
  165: If the @code{NLSPATH} environment variable is not set, the default value
  166: is
  167: 
  168: @smallexample
  169: @var{prefix}/share/locale/%L/%N:@var{prefix}/share/locale/%L/LC_MESSAGES/%N
  170: @end smallexample
  171: 
  172: @noindent
  173: where @var{prefix} is given to @code{configure} while installing the GNU
  174: C Library (this value is in many cases @code{/usr} or the empty string).
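
The substitution of the format elements can be illustrated with a small
sketch.  The helper @code{expand_nlspath_element} below is hypothetical
and not part of the GNU C Library; it expands a single element of an
@code{NLSPATH} value (without the separating colons) and assumes the
locale name has already been split into its @var{lang}, @var{terr} and
@var{codeset} parts.  Dropping unknown format elements is an assumption
of this sketch, not documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append S to OUT at *POS; return -1 if it does not fit.  */
static int
append (char *out, size_t outsize, size_t *pos, const char *s)
{
  size_t len = strlen (s);
  if (*pos + len >= outsize)
    return -1;
  memcpy (out + *pos, s, len + 1);   /* copies the terminating null too */
  *pos += len;
  return 0;
}

/* Hypothetical helper, not a GNU C Library function: expand one
   colon-free NLSPATH element TMPL into OUT.  Pass "" for a missing
   TERR or CODESET.  Returns 0 on success, -1 if a buffer is too small.  */
int
expand_nlspath_element (const char *tmpl, const char *cat_name,
                        const char *lang, const char *terr,
                        const char *codeset, char *out, size_t outsize)
{
  char locale[128];
  size_t lpos = 0, pos = 0;

  assert (out != NULL && outsize > 0);
  out[0] = '\0';

  /* %L is the whole locale name: lang[_terr[.codeset]].  */
  if (append (locale, sizeof locale, &lpos, lang) != 0)
    return -1;
  if (terr[0] != '\0')
    {
      if (append (locale, sizeof locale, &lpos, "_") != 0
          || append (locale, sizeof locale, &lpos, terr) != 0)
        return -1;
      if (codeset[0] != '\0'
          && (append (locale, sizeof locale, &lpos, ".") != 0
              || append (locale, sizeof locale, &lpos, codeset) != 0))
        return -1;
    }

  for (; *tmpl != '\0'; ++tmpl)
    {
      char literal[2] = { *tmpl, '\0' };
      const char *rep = literal;

      if (*tmpl == '%' && tmpl[1] != '\0')
        switch (*++tmpl)
          {
          case 'N': rep = cat_name; break;
          case 'L': rep = locale; break;
          case 'l': rep = lang; break;
          case 't': rep = terr; break;
          case 'c': rep = codeset; break;
          case '%': rep = "%"; break;
          default:  rep = ""; break;   /* assumption: drop unknown elements */
          }

      if (append (out, outsize, &pos, rep) != 0)
        return -1;
    }
  return 0;
}
```

For the locale @code{de_DE.UTF-8} and the catalog name @file{hello.cat},
the template @code{/usr/share/locale/%L/LC_MESSAGES/%N} expands to
@file{/usr/share/locale/de_DE.UTF-8/LC_MESSAGES/hello.cat}.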
  175: 
  176: The remaining problem is to decide which locale name must be used.  Its
  177: value determines the substitution of the format elements mentioned above.
  178: First of all the user can specify a path in the message catalog name
  179: (i.e., the name contains a slash character).  In this situation the
  180: @code{NLSPATH} environment variable is not used.  The catalog must exist
  181: as specified in the program, perhaps relative to the current working
  182: directory.  This situation is not desirable and catalog names should
  183: never be written this way.  Besides, this behavior is not portable
  184: to all other platforms providing the @code{catgets} interface.
  185: 
  186: @cindex LC_ALL environment variable
  187: @cindex LC_MESSAGES environment variable
  188: @cindex LANG environment variable
  189: Otherwise the values of environment variables from the standard
  190: environment are examined (@pxref{Standard Environment}).  Which
  191: variables are examined is decided by the @var{flag} parameter of
  192: @code{catopen}.  If the value is @code{NL_CAT_LOCALE} (which is defined
  193: in @file{nl_types.h}) then the @code{catopen} function uses the name of
  194: the locale currently selected for the @code{LC_MESSAGES} category.
  195: 
  196: If @var{flag} is zero the @code{LANG} environment variable is examined.
  197: This is a left-over from the early days where the concept of the locales
  198: had not even reached the level of POSIX locales.
  199: 
  200: The environment variable and the locale name should have a value of the
  201: form @code{@var{lang}[_@var{terr}[.@var{codeset}]]} as explained above.
  202: If no environment variable is set the @code{"C"} locale is used which
  203: prevents any translation.
  204: 
  205: The return value of @code{catgets} is in any case a valid string.  Either
  206: it is a translation from a message catalog or it is the same as the
  207: @var{string} parameter.  So a piece of code to decide whether a
  208: translation actually happened must look like this:
  209: 
  210: @smallexample
  211: @{
  212:   char *trans = catgets (desc, set, msg, input_string);
  213:   if (trans == input_string)
  214:     @{
  215:       /* Something went wrong.  */
  216:     @}
  217: @}
  218: @end smallexample
  219: 
  220: @noindent
  221: When an error occurred the global variable @var{errno} is set to
  222: 
  223: @table @var
  224: @item EBADF
  225: The catalog does not exist.
  226: @item ENOMSG
  227: The set/message tuple does not name an existing element in the
  228: message catalog.
  229: @end table
  230: 
  231: While it can sometimes be useful to test for errors, programs normally
  232: will avoid any test.  If the translation is not available it is no big
  233: problem if the original, untranslated message is printed.  Either the
  234: user understands this as well or s/he will look for the reason why the
  235: messages are not translated.
  236: @end deftypefun
  237: 
  238: Please note that the currently selected locale does not depend on a call
  239: to the @code{setlocale} function.  It is not necessary that the locale
  240: data files for this locale exist and calling @code{setlocale} succeeds.
  241: The @code{catopen} function directly reads the values of the environment
  242: variables.
  243: 
  244: 
  245: @deftypefun {char *} catgets (nl_catd @var{catalog_desc}, int @var{set}, int @var{message}, const char *@var{string})
  246: The function @code{catgets} has to be used to access the message catalog
  247: previously opened using the @code{catopen} function.  The
  248: @var{catalog_desc} parameter must be a value previously returned by
  249: @code{catopen}.
  250: 
  251: The next two parameters, @var{set} and @var{message}, reflect the
  252: internal organization of the message catalog files.  This will be
  253: explained in detail below.  For now it is interesting to know that a
  254: catalog can consist of several sets and the messages in each set are
  255: individually numbered.  Neither the set numbers nor the
  256: message numbers need be consecutive.  They can be arbitrarily chosen.
  257: But each message (unless equal to another one) must have its own unique
  258: pair of set and message number.
  259: 
  260: Since it is not guaranteed that the message catalog for the language
  261: selected by the user exists the last parameter @var{string} helps to
  262: handle this case gracefully.  If no matching string can be found
  263: @var{string} is returned.  This means for the programmer that
  264: 
  265: @itemize @bullet
  266: @item
  267: the @var{string} parameters should contain reasonable text (this also
  268: helps to understand the program since otherwise there would be no hint
  269: on the string which is expected to be returned).
  270: @item
  271: all @var{string} arguments should be written in the same language.
  272: @end itemize
  273: @end deftypefun
  274: 
  275: It is somewhat uncomfortable to write a program using the @code{catgets}
  276: functions if no supporting functionality is available.  Since each
  277: set/message number tuple must be unique the programmer must keep lists
  278: of the messages at the same time the code is written.  And the work
  279: between several people working on the same project must be coordinated.
  280: We will see how these problems can be relaxed a bit (@pxref{Common
  281: Usage}).
  282: 
  283: @deftypefun int catclose (nl_catd @var{catalog_desc})
  284: The @code{catclose} function can be used to free the resources
  285: associated with a message catalog which previously was opened by a call
  286: to @code{catopen}.  If the resources can be successfully freed the
  287: function returns @code{0}.  Otherwise it returns @code{@minus{}1} and the
  288: global variable @var{errno} is set.  Errors can occur if the catalog
  289: descriptor @var{catalog_desc} is not valid in which case @var{errno} is
  290: set to @code{EBADF}.
  291: @end deftypefun
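
The three functions are typically used together.  The following is a
minimal sketch under stated assumptions: @code{lookup_message} is a
hypothetical helper, not part of the library, which opens a catalog,
fetches one message and closes the catalog again.  A real program would
normally keep the descriptor open for its whole lifetime instead.

```c
#include <assert.h>
#include <nl_types.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look up message MSG of set SET in the catalog
   CAT_NAME and copy the result into BUF.  FALLBACK is used when the
   catalog or the message is missing.  Returns 1 if a translation was
   found and 0 otherwise.  */
int
lookup_message (const char *cat_name, int set, int msg,
                const char *fallback, char *buf, size_t bufsize)
{
  const char *text = fallback;
  int translated = 0;
  nl_catd catd;

  assert (buf != NULL && bufsize > 0);
  catd = catopen (cat_name, NL_CAT_LOCALE);

  if (catd != (nl_catd) -1)
    {
      text = catgets (catd, set, msg, fallback);
      /* On failure catgets returns the FALLBACK pointer itself.  */
      translated = (text != fallback);
    }

  /* Copy before closing: the string may belong to the catalog data.  */
  snprintf (buf, bufsize, "%s", text);

  if (catd != (nl_catd) -1)
    catclose (catd);
  return translated;
}
```

If no catalog with the given name can be found the helper returns
@code{0} and @var{buf} holds the untranslated fallback text, so the
caller can print the result without any further test.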
  292: 
  293: 
  294: @node The message catalog files
  295: @subsection  Format of the message catalog files
  296: 
  297: The only reasonable way to translate all the messages of a program and
  298: store the result in a message catalog file which can be read by the
  299: @code{catopen} function is to give all the message text to the
  300: translator and let her/him translate them all.  I.e., we must have a
  301: file with entries which associate the set/message tuple with a specific
  302: translation.  This file format is specified in the X/Open standard and
  303: is as follows:
  304: 
  305: @itemize @bullet
  306: @item
  307: Lines containing only whitespace characters or empty lines are ignored.
  308: 
  309: @item
  310: Lines which contain as the first non-whitespace character a @code{$}
  311: followed by a whitespace character are comments and are also ignored.
  312: 
  313: @item
  314: If a line contains as the first non-whitespace characters the sequence
  315: @code{$set} followed by a whitespace character an additional argument
  316: is required to follow.  This argument can either be:
  317: 
  318: @itemize @minus
  319: @item
  320: a number.  In this case the value of this number determines the set
  321: to which the following messages are added.
  322: 
  323: @item
  324: an identifier consisting of alphanumeric characters plus the underscore
  325: character.  In this case the set automatically gets a number assigned.
  326: This value is one added to the largest set number which so far appeared.
  327: 
  328: How to use the symbolic names is explained in section @ref{Common Usage}.
  329: 
  330: It is an error if a symbol name appears more than once.  All following
  331: messages are placed in a set with this number.
  332: @end itemize
  333: 
  334: @item
  335: If a line contains as the first non-whitespace characters the sequence
  336: @code{$delset} followed by a whitespace character an additional argument
  337: is required to follow.  This argument can either be:
  338: 
  339: @itemize @minus
  340: @item
  341: a number.  In this case the value of this number determines the set
  342: which will be deleted.
  343: 
  344: @item
  345: an identifier consisting of alphanumeric characters plus the underscore
  346: character.  This symbolic identifier must match a name for a set which
  347: previously was defined.  It is an error if the name is unknown.
  348: @end itemize
  349: 
  350: In both cases all messages in the specified set will be removed.  They
  351: will not appear in the output.  But if this set is selected again later
  352: with a @code{$set} command, messages could be added again and these
  353: messages will appear in the output.
  354: 
  355: @item
  356: If a line contains after leading whitespaces the sequence
  357: @code{$quote}, the quoting character used for this input file is
  358: changed to the first non-whitespace character following the
  359: @code{$quote}.  If no non-whitespace character is present before the
  360: line ends quoting is disabled.
  361: 
  362: By default no quoting character is used.  In this mode strings are
  363: terminated with the first unescaped line break.  If there is a
  364: @code{$quote} sequence present newlines need not be escaped.  Instead a
  365: string is terminated with the first unescaped appearance of the quote
  366: character.
  367: 
  368: A common usage of this feature would be to set the quote character to
  369: @code{"}.  Then any appearance of the @code{"} in the strings must
  370: be escaped using the backslash (i.e., @code{\"} must be written).
  371: 
  372: @item
  373: Any other line must start with a number or an alphanumeric identifier
  374: (with the underscore character included).  The following characters
  375: (starting after the first whitespace character) will form the string
  376: which gets associated with the currently selected set and the message
  377: number represented by the number or identifier respectively.
  378: 
  379: If the start of the line is a number the message number is obvious.  It
  380: is an error if the same message number already appeared for this set.
  381: 
  382: If the leading token was an identifier the message number gets
  383: automatically assigned.  The value is the current maximum message
  384: number for this set plus one.  It is an error if the identifier was
  385: already used for a message in this set.  It is OK to reuse the
  386: identifier for a message in another set.  How to use the symbolic
  387: identifiers will be explained below (@pxref{Common Usage}).  There is
  388: one limitation with the identifier: it must not be @code{Set}.  The
  389: reason will be explained below.
  390: 
  391: The text of the messages can contain escape characters.  The usual bunch
  392: of characters known from the @w{ISO C} language are recognized
  393: (@code{\n}, @code{\t}, @code{\v}, @code{\b}, @code{\r}, @code{\f},
  394: @code{\\}, and @code{\@var{nnn}}, where @var{nnn} is the octal coding of
  395: a character code).
  396: @end itemize
  397: 
  398: @strong{Important:} The handling of identifiers instead of numbers for
  399: the set and messages is a GNU extension.  Systems strictly following the
  400: X/Open specification do not have this feature.  An example for a message
  401: catalog file is this:
  402: 
  403: @smallexample
  404: $ This is a leading comment.
  405: $quote "
  406: 
  407: $set SetOne
  408: 1 Message with ID 1.
  409: two "   Message with ID \"two\", which gets the value 2 assigned"
  410: 
  411: $set SetTwo
  412: $ Since the last set got the number 1 assigned this set has number 2.
  413: 4000 "The numbers can be arbitrary, they need not start at one."
  414: @end smallexample
  415: 
  416: This small example shows various aspects:
  417: @itemize @bullet
  418: @item
  419: Lines 1 and 9 are comments since they start with @code{$} followed by
  420: a whitespace.
  421: @item
  422: The quoting character is set to @code{"}.  Otherwise the quotes in the
  423: message definition would have to be left out and in this case the
  424: message with the identifier @code{two} would lose its leading whitespace.
  425: @item
  426: Mixing numbered messages with messages having symbolic names is no
  427: problem and the numbering happens automatically.
  428: @end itemize
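
The line-oriented rules of the source format can be sketched as a small
classifier.  This is only an illustration of the rules listed above and
not the code used by @code{gencat}; the names @code{line_kind} and
@code{classify_catalog_line} are hypothetical.  A @code{$} followed by
anything not covered by the rules is reported as unknown here, which is
an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum line_kind
{
  LINE_EMPTY, LINE_COMMENT, LINE_SET, LINE_DELSET, LINE_QUOTE,
  LINE_MESSAGE, LINE_UNKNOWN
};

/* Return nonzero if P starts with the directive NAME followed by a
   whitespace character.  */
static int
is_directive (const char *p, const char *name)
{
  size_t len = strlen (name);
  return strncmp (p, name, len) == 0 && isspace ((unsigned char) p[len]);
}

/* Classify one line of a message catalog source file.  */
enum line_kind
classify_catalog_line (const char *line)
{
  assert (line != NULL);

  /* Leading whitespace is not significant for the classification.  */
  while (isspace ((unsigned char) *line))
    ++line;

  if (*line == '\0')
    return LINE_EMPTY;
  if (*line != '$')
    return LINE_MESSAGE;        /* starts with a number or identifier */
  if (isspace ((unsigned char) line[1]))
    return LINE_COMMENT;        /* `$' followed by whitespace */
  if (is_directive (line, "$set"))
    return LINE_SET;
  if (is_directive (line, "$delset"))
    return LINE_DELSET;
  if (strncmp (line, "$quote", 6) == 0)
    return LINE_QUOTE;          /* the quote character may be absent */
  return LINE_UNKNOWN;
}
```

Applied to the example file above, the first line is classified as a
comment, the second as a @code{$quote} directive and the line
@code{1 Message with ID 1.} as a message.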
  429: 
  430: 
  431: While this file format is pretty easy it is not the best possible for
  432: use in a running program.  The @code{catopen} function would have to
  433: parse the file and handle syntactic errors gracefully.  This is not so
  434: easy and the whole process is pretty slow.  Therefore the @code{catgets}
  435: functions expect the data in another more compact and ready-to-use file
  436: format.  There is a special program @code{gencat} which is explained in
  437: detail in the next section.
  438: 
  439: Files in this other format are not human readable.  To be easy to use by
  440: programs it is a binary file.  But the format is byte order independent
  441: so translation files can be shared by systems of arbitrary architecture
  442: (as long as they use the GNU C Library).
  443: 
  444: Details about the binary file format are not important to know since
  445: these files are always created by the @code{gencat} program.  The
  446: sources of the GNU C Library also provide the sources for the
  447: @code{gencat} program and so the interested reader can look through
  448: these source files to learn about the file format.
  449: 
  450: 
  451: @node The gencat program
  452: @subsection Generate Message Catalog Files
  453: 
  454: @cindex gencat
  455: The @code{gencat} program is specified in the X/Open standard and the
  456: GNU implementation follows this specification and so processes
  457: all correctly formed input files.  Additionally some extensions are
  458: implemented which help to work in a more reasonable way with the
  459: @code{catgets} functions.
  460: 
  461: The @code{gencat} program can be invoked in two ways:
  462: 
  463: @example
  464: `gencat [@var{Option}]@dots{} [@var{Output-File} [@var{Input-File}]@dots{}]`
  465: @end example
  466: 
  467: This is the interface defined in the X/Open standard.  If no
  468: @var{Input-File} parameter is given input will be read from standard
  469: input.  Multiple input files will be read as if they are concatenated.
  470: If @var{Output-File} is also missing, the output will be written to
  471: standard output.  To provide the interface one is used to from other
  472: programs a second interface is provided.
  473: 
  474: @smallexample
  475: `gencat [@var{Option}]@dots{} -o @var{Output-File} [@var{Input-File}]@dots{}`
  476: @end smallexample
  477: 
  478: The option @samp{-o} is used to specify the output file and all file
  479: arguments are used as input files.
  480: 
  481: Beside this one can use @file{-} or @file{/dev/stdin} for
  482: @var{Input-File} to denote the standard input.  Correspondingly one can
  483: use @file{-} and @file{/dev/stdout} for @var{Output-File} to denote
  484: standard output.  Using @file{-} as a file name is allowed in X/Open
  485: while using the device names is a GNU extension.
  486: 
  487: The @code{gencat} program works by concatenating all input files and
  488: then @strong{merge} the resulting collection of message sets with a
  489: possibly existing output file.  This is done by removing all messages
  490: with set/message number tuples matching any of the generated messages
  491: from the output file and then adding all the new messages.  To
  492: regenerate a catalog file while ignoring the old contents therefore
  493: requires to remove the output file if it exists.  If the output is
  494: written to standard output no merging takes place.
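
The merge rule can be pictured with an in-memory sketch.  The
@code{struct message} type and the @code{merge_messages} function are
hypothetical and say nothing about how @code{gencat} is implemented;
they only model the rule that a new message replaces an old one with
the same set/message number tuple while all other old messages are
kept.

```c
#include <assert.h>
#include <stddef.h>

struct message
{
  int set;
  int number;
  const char *text;
};

/* Merge the NEW_LEN messages in NEW_MSGS into CATALOG, which holds
   *LEN entries and has room for CAP.  A new message replaces an
   existing one with the same set/number tuple; otherwise it is
   appended.  Returns 0 on success, -1 if CATALOG is full.  */
int
merge_messages (struct message *catalog, size_t *len, size_t cap,
                const struct message *new_msgs, size_t new_len)
{
  assert (catalog != NULL && len != NULL);

  for (size_t i = 0; i < new_len; ++i)
    {
      size_t j;

      /* Look for an old message with the same tuple.  */
      for (j = 0; j < *len; ++j)
        if (catalog[j].set == new_msgs[i].set
            && catalog[j].number == new_msgs[i].number)
          break;

      if (j == *len)
        {
          /* No match: the message is added at the end.  */
          if (*len == cap)
            return -1;
          ++*len;
        }
      catalog[j] = new_msgs[i];
    }
  return 0;
}
```

Merging a new message with the tuple (1, 2) into a catalog that already
contains (1, 1) and (1, 2) replaces only the second entry; the old
message (1, 1) survives, which is why a stale output file must be
removed to get a completely fresh catalog.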
  495: 
  496: @noindent
  497: The following table shows the options understood by the @code{gencat}
  498: program.  The X/Open standard does not specify any option for the
  499: program so all of these are GNU extensions.
  500: 
  501: @table @samp
  502: @item -V
  503: @itemx --version
  504: Print the version information and exit.
  505: @item -h
  506: @itemx --help
  507: Print a usage message listing all available options, then exit successfully.
  508: @item --new
  509: Never merge the new messages from the input files with the old content
  510: of the output file.  The old content of the output file is discarded.
  511: @item -H
  512: @itemx --header=name
  513: This option is used to emit the symbolic names given to sets and
  514: messages in the input files for use in the program.  Details about how
  515: to use this are given in the next section.  The @var{name} parameter to
  516: this option specifies the name of the output file.  It will contain a
  517: number of C preprocessor @code{#define}s to associate a name with a
  518: number.
  519: 
  520: Please note that the generated file only contains the symbols from the
  521: input files.  If the output is merged with the previous content of the
  522: output file the possibly existing symbols from the file(s) which
  523: generated the old output files are not in the generated header file.
  524: @end table
  525: 
  526: 
  527: @node Common Usage
  528: @subsection How to use the @code{catgets} interface
  529: 
  530: The @code{catgets} functions can be used in two different ways: by
  531: slavishly following the X/Open specs and not relying on the extensions,
  532: or by using the GNU extensions.  We will take a look at the former
  533: method first to understand the benefits of extensions.
  534: 
  535: @subsubsection Not using symbolic names
  536: 
  537: Since the X/Open format of the message catalog files does not allow
  538: symbol names we have to work with numbers all the time.  When we start
  539: writing a program we have to replace all appearances of translatable
  540: strings with something like
  541: 
  542: @smallexample
  543: catgets (catdesc, set, msg, "string")
  544: @end smallexample
  545: 
  546: @noindent
  547: @var{catdesc} is retrieved from a call to @code{catopen} which is
  548: normally done once at the program start.  The @code{"string"} is the
  549: string we want to translate.  The problems start with the set and
  550: message numbers.
  551: 
  552: In a bigger program several programmers usually work at the same time on
  553: the program and so coordinating the number allocation is crucial.
  554: Though no two different strings may be indexed by the same tuple of
  555: numbers, it is highly desirable to reuse the numbers for equal strings
  556: with equal translations (please note that there might be strings which
  557: are equal in one language but have different translations due to
  558: different contexts).
  559: 
  560: The allocation process can be relaxed a bit by different set numbers for
  561: different parts of the program.  So the number of developers who have to
  562: coordinate the allocation can be reduced.  But lists must still be kept
  563: to track the allocation and errors can easily happen.  These errors
  564: cannot be discovered by the compiler or the @code{catgets} functions.
  565: Only the user of the program might see wrong messages printed.  In the
  566: worst cases the messages are so misleading that they cannot be
  567: recognized as wrong.  Think about the translations for @code{"true"} and
  568: @code{"false"} being exchanged.  This could result in a disaster.
  569: 
  570: 
  571: @subsubsection Using symbolic names
  572: 
  573: The problems mentioned in the last section derive from the fact that:
  574: 
  575: @enumerate
  576: @item
  577: the numbers are allocated once and due to the possibly frequent use of
  578: them it is difficult to change a number later.
  579: @item
  580: the numbers do not allow one to guess anything about the string and
  581: therefore collisions can easily happen.
  582: @end enumerate
  583: 
  584: By constantly using symbolic names and by providing a method which maps
  585: the string content to a symbolic name (however this will happen) one can
  586: prevent both problems above.  The cost of this is that the programmer
  587: has to write a complete message catalog file while s/he is writing the
  588: program itself.
  589: 
  590: This is necessary since the symbolic names must be mapped to numbers
  591: before the program sources can be compiled.  In the last section it was
  592: described how to generate a header containing the mapping of the names.
  593: E.g., for the example message file given in the last section we could
  594: call the @code{gencat} program as follows (assume @file{ex.msg} contains
  595: the sources).
  596: 
  597: @smallexample
  598: gencat -H ex.h -o ex.cat ex.msg
  599: @end smallexample
  600: 
  601: @noindent
  602: This generates a header file with the following content:
  603: 
  604: @smallexample
  605: #define SetTwoSet 0x2   /* ex.msg:8 */
  606: 
  607: #define SetOneSet 0x1   /* ex.msg:4 */
  608: #define SetOnetwo 0x2   /* ex.msg:6 */
  609: @end smallexample
  610: 
  611: As can be seen the various symbols given in the source file are mangled
  612: to generate unique identifiers and these identifiers get numbers
  613: assigned.  Reading the source file and knowing about the rules will
  614: allow one to predict the content of the header file (it is deterministic)
  615: but this is not necessary.  The @code{gencat} program can take care of
  616: everything.  All the programmer has to do is to put the generated header
  617: file in the dependency list of the source files of her/his project and
  618: to add a rule to regenerate the header if any of the input files
  619: changes.
  620: 
  621: One word about the symbol mangling.  Every symbol consists of two parts:
  622: the name of the message set plus the name of the message or the special
  623: string @code{Set}.  So @code{SetOnetwo} means this macro can be used to
  624: access the translation with identifier @code{two} in the message set
  625: @code{SetOne}.
  626: 
  627: The other names denote the names of the message sets.  The special
  628: string @code{Set} is used in the place of the message identifier.
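
The naming rule is simple enough to be written down in a few lines.
The function @code{mangle_symbol} below is a hypothetical illustration
of the rule as described here, not code taken from @code{gencat}.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the macro name for message MSG_NAME in
   set SET_NAME.  Pass NULL for MSG_NAME to get the name of the macro
   for the set itself, which uses the special string "Set".  Returns 0
   on success, -1 if OUT is too small.  */
int
mangle_symbol (const char *set_name, const char *msg_name,
               char *out, size_t outsize)
{
  int n;

  assert (set_name != NULL && out != NULL);
  n = snprintf (out, outsize, "%s%s", set_name,
                msg_name != NULL ? msg_name : "Set");
  return (n >= 0 && (size_t) n < outsize) ? 0 : -1;
}
```

With the set @code{SetOne} and the message @code{two} this yields
@code{SetOnetwo}; with the set @code{SetTwo} and no message name it
yields @code{SetTwoSet}.  This also shows why a message must not be
named @code{Set}: its macro would collide with the one for the set.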
  629: 
  630: If in the code the second string of the set @code{SetOne} is used the C
  631: code should look like this:
  632: 
  633: @smallexample
  634: catgets (catdesc, SetOneSet, SetOnetwo,
  635:          "   Message with ID \"two\", which gets the value 2 assigned")
  636: @end smallexample
  637: 
  638: Writing the call this way allows changing the message number
  639: and even the set number without requiring any change in the C source
  640: code.  (The text of the string is normally not the same; this is only
  641: for this example.)
  642: 
  643: 
  644: @subsubsection How does this allow one to develop
  645: 
  646: To illustrate the usual way to work with the symbolic version numbers
  647: here is a little example.  Assume we want to write the very complex and
  648: famous greeting program.  We start by writing the code as usual:
  649: 
  650: @smallexample
  651: #include <stdio.h>
  652: int
  653: main (void)
  654: @{
  655:   printf ("Hello, world!\n");
  656:   return 0;
  657: @}
  658: @end smallexample
  659: 
  660: Now we want to internationalize the message and therefore replace the
  661: message with whatever the user wants.
  662: 
  663: @smallexample
  664: #include <nl_types.h>
  665: #include <stdio.h>
  666: #include "msgnrs.h"
  667: int
  668: main (void)
  669: @{
  670:   nl_catd catdesc = catopen ("hello.cat", NL_CAT_LOCALE);
  671:   printf (catgets (catdesc, SetMainSet, SetMainHello,
  672:                    "Hello, world!\n"));
  673:   catclose (catdesc);
  674:   return 0;
  675: @}
  676: @end smallexample
  677: 
  678: We see how the catalog object is opened and the returned descriptor used
  679: in the other function calls.  It is not really necessary to check for
  680: failure of any of the functions since even in these situations the
  681: functions will behave reasonably.  When no translation is available
  682: @code{catgets} simply returns the untranslated string.
  683: 
  684: What remains unspecified here are the constants @code{SetMainSet} and
  685: @code{SetMainHello}.  These are the symbolic names describing the
  686: message.  To get the actual definitions which match the information in
  687: the catalog file we have to create the message catalog source file and
  688: process it using the @code{gencat} program.
  689: 
  690: @smallexample
  691: $ Messages for the famous greeting program.
  692: $quote "
  693: 
  694: $set SetMain
  695: Hello "Hallo, Welt!\n"
  696: @end smallexample
  697: 
  698: Now we can start building the program (assume the message catalog source
  699: file is named @file{hello.msg} and the program source file @file{hello.c}):
  700: 
  701: @smallexample
  702: @cartouche
  703: % gencat -H msgnrs.h -o hello.cat hello.msg
  704: % cat msgnrs.h
  705: #define SetMainSet 0x1     /* hello.msg:4 */
  706: #define SetMainHello 0x1   /* hello.msg:5 */
  707: % gcc -o hello hello.c -I.
  708: % cp hello.cat /usr/share/locale/de/LC_MESSAGES
  709: % echo $LC_ALL
  710: de
  711: % ./hello
  712: Hallo, Welt!
  713: %
  714: @end cartouche
  715: @end smallexample
  716: 
  717: The call of the @code{gencat} program creates the missing header file
  718: @file{msgnrs.h} as well as the message catalog binary.  The former is
  719: used in the compilation of @file{hello.c} while the later is placed in a
  720: directory in which the @code{catopen} function will try to locate it.
  721: Please check the @code{LC_ALL} environment variable and the default path
  722: for @code{catopen} presented in the description above.
  723: 
  724: 
  725: @node The Uniforum approach
  726: @section The Uniforum approach to Message Translation
  727: 
  728: Sun Microsystems tried to standardize a different approach to message
  729: translation in the Uniforum group.  There never was a real standard
  730: defined but still the interface was used in Sun's operating systems.
  731: Since this approach fits better in the development process of free
  732: software it is also used throughout the GNU project and the GNU
  733: @file{gettext} package provides support for this outside the GNU C
  734: Library.
  735: 
  736: The code of the @file{libintl} from GNU @file{gettext} is the same as
  737: the code in the GNU C Library.  So the documentation in the GNU
  738: @file{gettext} manual is also valid for the functionality here.  The
  739: following text will describe the library functions in detail.  But the
  740: numerous helper programs are not described in this manual.  Instead
  741: people should read the GNU @file{gettext} manual
  742: (@pxref{Top,,GNU gettext utilities,gettext,Native Language Support Library and Tools}).
  743: We will only give a short overview.
  744: