
i18n_intro(5)							 i18n_intro(5)

NAME
       i18n_intro, i18n, LANG, LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES,
       LC_MONETARY, LC_NUMERIC, LC_TIME - Introduction to internationalization

DESCRIPTION
       Internationalization refers to the process of developing programs with‐
       out prior knowledge of the language, cultural data, or character-encod‐
       ing schemes that the programs are expected to handle. In other words,
       internationalization refers to the availability and use	of  interfaces
       that  let programs modify their behavior at run time for operation in a
       specific language environment.  The abbreviation I18N is often used  to
       stand  for internationalization, as there are 18 characters between the
       beginning "I" and the ending "N" of that word.

       The I18N interfaces and utilities provided with	the  operating	system
       conform to Issue 4 of X/Open CAE specifications.

       A concept related to internationalization is localization (L10N), which
       refers to the process of establishing  information  within  a  computer
       system  for  each  combination  of  native language, cultural data, and
       coded character set (codeset). A locale is  a  database	that  provides
       information  for	 a  unique combination of these three components. How‐
       ever, locales do not solve all of the problems that  localization  must
       address.	 Many  native languages require additional support in the form
       of language-specific print filters, fonts, codeset converters,  charac‐
       ter input methods, and other kinds of specialized software.

       See the following reference pages for additional introductory informa‐
       tion on topics related to internationalization:

       l10n_intro(5)
              For more information on localization and locales

       iconv_intro(5)
              For an introduction to codeset conversion

       i18n_printing(5)
              For a summary of printer support for native languages

   Characters, Character Sets, and Codesets
       A character is a member of a set of elements used for the organization,
       control, or representation of data.

       A character set is a set of alphabetic or other characters used to con‐
       struct the words and other elementary units of  a  native  language  or
       computer	 language.  A character set specifies only the characters that
       are included in the set.	 ASCII, CNS 11643 and DTSCS  are  examples  of
       character sets.

       A coded character set (codeset) is a set of unambiguous rules that sup‐
       port one or more character sets and establish the one-to-one relation‐
       ship between each character and its bit representation. In other
       words, a codeset consists of the code points for characters in  one  or
       more character sets. For example, DEC Hanyu (dechanyu) is a codeset for
       Chinese and contains code points	 for  characters  in  the  ASCII,  CNS
       11643-1986 (plane 1 and plane 2), and DTSCS character sets.
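
       As an illustration of a code point, the following sketch prints the
       byte values that a codeset assigns to a character. The function name
       code_point_hex is hypothetical, and an ASCII-compatible codeset is
       assumed; od and tr are standard utilities.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of a string as hexadecimal digits.
# Assumes the string is stored in an ASCII-compatible codeset.
code_point_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

code_point_hex A
```

       For a multibyte codeset the same helper would print more than one
       byte per character.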

   Language Announcement (Setting Locale)
       Language	 announcement  is  the	mechanism  by which language, cultural
       data, and codeset requirements are set either for the system as a whole
       or by individual users. An application can also set these requirements,
       although it is more common for an internationalized application to  use
       the setting in effect for the user who runs the program. See the System
       Administration manual for information about setting systemwide defaults
       for shells. See setlocale(3) and Writing Software for the International
       Market for information on how applications query or set locale require‐
       ments at run time.

       Language	 announcement  is  performed  by  setting one or more reserved
       environment variables to the name of an installed locale.  Each	locale
       has  associated	with  it  collating  sequences,	 character  conversion
       tables, character classification tables, formats for different kinds of
       data,  and message catalogs. If the same locale meets user requirements
       in all these categories, set only the LANG environment variable to  the
       locale name. A locale name usually has the following format:

       language_territory.codeset[@modifier]

       Where  language	represents the human language of the locale, territory
       represents a geographic country or region, codeset is the coded charac‐
       ter  set	 used  in the locale, and the optional @modifier suffix repre‐
       sents additional information for localization of data.
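
       To make the parts of a locale name concrete, the sketch below splits
       a name with POSIX shell parameter expansion. The function name
       split_locale_name is hypothetical, and the name is assumed to be
       written as language_territory.codeset with an optional @modifier
       suffix, as in the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper: split a locale name of the assumed form
# language_territory.codeset[@modifier] into its components.
split_locale_name() {
    name=$1
    modifier=
    codeset=
    territory=
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    language=$name
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$language" "$territory" "$codeset" "$modifier"
}

split_locale_name zh_TW.eucTW@stroke
```

       Parts that are absent from the name are printed as empty strings.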

       The following Korn shell example sets LANG to a locale  supporting  the
       English language, United States cultural data, and ISO8859-1 codeset: $
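
       A minimal sketch of such an assignment in the Korn shell follows. The
       locale name en_US.ISO8859-1 is an assumption composed from the
       language, territory, and codeset named in the text; check the names
       installed on your system before using it.

```shell
# Assumed locale name: English language, United States cultural data,
# ISO8859-1 codeset.
LANG=en_US.ISO8859-1
export LANG
```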

       The following C shell example sets LANG to a locale supporting the Tra‐
       ditional Chinese language, Hong Kong cultural data, and the DEC Hanyu
       codeset:

              % setenv LANG zh_HK.dechanyu

       Locale name formats can vary from vendor to vendor. Use the locale -a
       command to display the names of locales installed on your system. See
       l10n_intro(5) for a list of the locales provided with Tru64 UNIX.
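
       The following sketch checks whether a name appears in the output of
       locale -a before it is assigned to LANG. The function name
       locale_installed is hypothetical; grep -F -x is used so that the name
       must match a whole output line exactly.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named locale is listed by `locale -a`.
locale_installed() {
    locale -a 2>/dev/null | grep -F -x -q -- "$1"
}

if locale_installed C; then
    echo "C is listed"
else
    echo "C is not listed"
fi
```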

       An alternative way to set locale requirements for all locale categories
       is to set the LC_ALL environment variable. The difference  between  the
       LANG  and LC_ALL variables is that LC_ALL is a high-precedence variable
       that overrides all other locale variables,  including  LANG.  The  LANG
       variable,  on  the other hand, is a low-precedence variable.  When used
       by itself, the LANG variable implicitly sets all locale	categories  to
       the  specified  locale  just as LC_ALL does. However, the LANG variable
       can be used together with variables for specific locale	categories  to
       create a multilocale environment. The category-specific locale vari‐
       ables and what they control follow:

       LC_COLLATE
              String collation

       LC_CTYPE
              Character classification

       LC_MESSAGES
              Translations for messages and valid strings for "yes" and "no"
              responses

       LC_MONETARY
              The currency symbol and the format of monetary values

       LC_NUMERIC
              The format of numeric values

       LC_TIME
              The format of date and time values

	      A locale can support only one set of date and time formats; how‐
	      ever, there can be several sets of date and time formats in  use
	      for  a  particular language and territory. See l10n_intro(5) for
	      information about creating a site-specific version of  a	locale
	      to  support date and time formats different from those supported
	      by an installed locale.
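
       The precedence described above (LC_ALL first, then the category-spe‐
       cific variable, then LANG) can be sketched as a small shell function.
       The name effective_locale is hypothetical, and the fallback to the C
       locale when nothing is set is an assumption, not something this page
       states.

```shell
#!/bin/sh
# Hypothetical helper: print the locale in effect for one category,
# for example `effective_locale LC_COLLATE`.
effective_locale() {
    category=$1
    # Look up the variable whose name is stored in $category.
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"        # LC_ALL overrides everything else
    elif [ -n "$specific" ]; then
        echo "$specific"      # the category-specific variable comes next
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"          # LANG is the low-precedence default
    else
        echo C                # assumed fallback when nothing is set
    fi
}

# Run in a subshell so the caller's environment is left untouched.
(
    unset LC_ALL LC_TIME
    LANG=zh_TW.eucTW
    LC_COLLATE=zh_TW.eucTW@stroke
    effective_locale LC_COLLATE
    effective_locale LC_TIME
)
```

       The function only mirrors the lookup order; the locale command
       remains the authoritative way to see the settings in effect.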

       The operating system provides dense code locales and  Unicode  locales.
       Unicode locales are installed in /usr/i18n/lib/nls/ucsloc/.  Dense code
       locales are installed in /usr/i18n/lib/nls/loc/.	 The  Unicode  locales
       enable consistent wchar_t values across locales and platform interoper‐
       ability. The system administrator, as root, can define the systemwide
       default as Unicode locales or dense code locales by changing the target
       of the symbolic link /usr/i18n/lib/nls/dloc/. See l10n_intro(5) for
       more information on the Unicode locales and switching between Unicode
       and dense code. See Unicode(5) for more information about UCS-4 and
       UTF-8.

       Unicode	locales,  with	a  UTF-8  suffix,  use	UTF-32 as the internal
       process code and UTF-8 as the file format.
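
       To illustrate the difference between file code and process code, the
       sketch below converts one character from its UTF-8 bytes to a UTF-32
       value with iconv. The encoding names UTF-8 and UTF-32BE are assump‐
       tions that hold for common iconv implementations; the converter names
       available on a particular system may differ.

```shell
#!/bin/sh
# U+00E9 (e with acute accent) is stored as two bytes in UTF-8: 0xC3 0xA9.
# Converted to UTF-32 it occupies a single 4-byte unit.
printf '\303\251' | iconv -f UTF-8 -t UTF-32BE | od -An -tx1
```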

       The operating system also includes a complete set of non-UTF-8  Unicode
       locales	in  /usr/i18n/lib/nls/ucsloc/  that  provide  UTF-32  internal
       process code for applications that require file code in the format of
       a traditional UNIX codeset or a proprietary codeset.

       A  @modifier  suffix indicates locale variants that support alternative
       rules for collation in Asian languages.	Use locales  with  these  suf‐
       fixes  only when setting LC_COLLATE.  For example, three different sets
       of collation rules (chuyin, radical, and stroke) can be used  with  the
       locale  supporting  the	Chinese language, Taiwanese cultural data, and
       the Taiwanese EUC codeset. If Korn shell users want to use this locale,
       they might make the following settings:

              $ LANG=zh_TW.eucTW
              $ LC_COLLATE=zh_TW.eucTW@stroke

       The preceding example implicitly sets all locale category variables  to
       zh_TW.eucTW,  except  for  the  LC_COLLATE  variable,  which  is set to
       zh_TW.eucTW@stroke. The following locale command displays the  variable
       settings after these assignments:

              $ locale
              LANG=zh_TW.eucTW
              LC_COLLATE=zh_TW.eucTW@stroke
              LC_CTYPE="zh_TW.eucTW"
              LC_MONETARY="zh_TW.eucTW"
              LC_NUMERIC="zh_TW.eucTW"
              LC_TIME="zh_TW.eucTW"
              LC_MESSAGES="zh_TW.eucTW"
              LC_ALL=
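
       Output of this kind can also be examined from a script. The sketch
       below reads NAME=value lines and lists the categories whose value
       differs from LANG. The function name differing_categories is hypo‐
       thetical, and the input is assumed to list LANG before the LC_* vari‐
       ables, as the locale command does.

```shell
#!/bin/sh
# Hypothetical helper: read `locale`-style NAME=value lines on standard
# input and print each LC_* category whose value differs from LANG.
# Assumes the LANG line comes before the LC_* lines.
differing_categories() {
    lang=
    while IFS='=' read -r name value; do
        # Strip the double quotes that mark implicitly set values.
        value=${value#\"}
        value=${value%\"}
        case $name in
            LANG) lang=$value ;;
            LC_ALL) ;;    # not a category of its own; skip it
            LC_*) [ "$value" = "$lang" ] || echo "$name=$value" ;;
        esac
    done
}

# Sample input taken from the settings shown on this page.
printf '%s\n' \
    'LANG=zh_TW.eucTW' \
    'LC_COLLATE=zh_TW.eucTW@stroke' \
    'LC_CTYPE="zh_TW.eucTW"' \
    'LC_ALL=' | differing_categories
```

       In practice the input would come from the locale command itself, for
       example: locale | differing_categories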

SEE ALSO
       Commands: locale(1), setlocale(3)

       Others: i18n_printing(5), iconv_intro(5), l10n_intro(5), Unicode(5)

       Writing Software for the International Market

       Using International Software

       System Administration

