nls man page on NetBSD

NLS(7)		     BSD Miscellaneous Information Manual		NLS(7)

NAME
     NLS — Native Language Support Overview

DESCRIPTION
     Native Language Support (NLS) provides commands for a single worldwide
     operating system base.  An internationalized system has no built-in
     assumptions or dependencies on language-specific or cultural-specific
     conventions such as:

	   ·   Character classifications
	   ·   Character comparison rules
	   ·   Character collation order
	   ·   Numeric and monetary formatting
	   ·   Date and time formatting
	   ·   Message-text language
	   ·   Character sets

     All information pertaining to cultural conventions and language is
     obtained at program run time.

     “Internationalization” (often abbreviated “i18n”) refers to the operation
     by which system software is developed to support multiple cultural-spe‐
     cific and language-specific conventions.  This is a generalization
     process by which the system is untied from calling only English strings
     or other English-specific conventions.  “Localization” (often abbreviated
     “l10n”) refers to the operations by which the user environment is custom‐
     ized to handle its input and output appropriate for specific language and
     cultural conventions.  This is a specialization process, by which generic
     methods already implemented in an internationalized system are used in
     specific ways.  The formal description of cultural conventions for some
     country, together with all associated translations targeted to the native
     language, is called the “locale”.

     NetBSD provides extensive support to programmers and system developers to
     enable internationalized software to be developed.	 NetBSD also supplies
     a large variety of locales for system localization.

   Localization of Information
     All locale information is accessible to programs at run time so that data
     is processed and displayed correctly for specific cultural conventions
     and language.

     A locale is divided into categories.  A category is a group of language-
     specific and culture-specific conventions as outlined in the list above.
     ISO C and POSIX together specify the following six standard categories
     supported by NetBSD (LC_MESSAGES comes from POSIX rather than ISO C):

     LC_COLLATE	    string-collation order information
     LC_CTYPE	    character classification, case conversion, and other char‐
		    acter attributes
     LC_MESSAGES    the format for affirmative and negative responses
     LC_MONETARY    rules and symbols for formatting monetary numeric informa‐
		    tion
     LC_NUMERIC	    rules and symbols for formatting nonmonetary numeric
		    information
     LC_TIME	    rules and symbols for formatting time and date information

     Localization of the system is achieved by setting appropriate values in
     environment variables to identify which locale should be used.  The envi‐
     ronment variables have the same names as their respective locale cate‐
     gories.  Additionally, the LANG, LC_ALL, and NLSPATH environment vari‐
     ables are used.  The NLSPATH environment variable specifies a colon-sepa‐
     rated list of directory names where the message catalog files of the NLS
     database are located.  The LC_ALL and LANG environment variables also
     determine the current locale.

     The values of these environment variables are strings of the form:

	     language[_territory][.codeset][@modifier]

     Valid values for the language field come from the ISO 639 standard, which
     defines two-letter codes for many languages.  Some common language
     codes are:

     Language Name	Code	   Language Family
     ABKHAZIAN		AB	   IBERO-CAUCASIAN
     AFAN (OROMO)	OM	   HAMITIC
     AFAR		AA	   HAMITIC
     AFRIKAANS		AF	   GERMANIC
     ALBANIAN		SQ	   INDO-EUROPEAN (OTHER)
     AMHARIC		AM	   SEMITIC
     ARABIC		AR	   SEMITIC
     ARMENIAN		HY	   INDO-EUROPEAN (OTHER)
     ASSAMESE		AS	   INDIAN
     AYMARA		AY	   AMERINDIAN
     AZERBAIJANI	AZ	   TURKIC/ALTAIC
     BASHKIR		BA	   TURKIC/ALTAIC
     BASQUE		EU	   BASQUE
     BENGALI		BN	   INDIAN
     BHUTANI		DZ	   ASIAN
     BIHARI		BH	   INDIAN
     BISLAMA		BI
     BRETON		BR	   CELTIC
     BULGARIAN		BG	   SLAVIC
     BURMESE		MY	   ASIAN
     BYELORUSSIAN	BE	   SLAVIC
     CAMBODIAN		KM	   ASIAN
     CATALAN		CA	   ROMANCE
     CHINESE		ZH	   ASIAN
     CORSICAN		CO	   ROMANCE
     CROATIAN		HR	   SLAVIC
     CZECH		CS	   SLAVIC
     DANISH		DA	   GERMANIC
     DUTCH		NL	   GERMANIC
     ENGLISH		EN	   GERMANIC
     ESPERANTO		EO	   INTERNATIONAL AUX.
     ESTONIAN		ET	   FINNO-UGRIC
     FAROESE		FO	   GERMANIC
     FIJI		FJ	   OCEANIC/INDONESIAN
     FINNISH		FI	   FINNO-UGRIC
     FRENCH		FR	   ROMANCE
     FRISIAN		FY	   GERMANIC
     GALICIAN		GL	   ROMANCE
     GEORGIAN		KA	   IBERO-CAUCASIAN
     GERMAN		DE	   GERMANIC
     GREEK		EL	   LATIN/GREEK
     GREENLANDIC	KL	   ESKIMO
     GUARANI		GN	   AMERINDIAN
     GUJARATI		GU	   INDIAN
     HAUSA		HA	   NEGRO-AFRICAN
     HEBREW		HE	   SEMITIC
     HINDI		HI	   INDIAN
     HUNGARIAN		HU	   FINNO-UGRIC
     ICELANDIC		IS	   GERMANIC
     INDONESIAN		ID	   OCEANIC/INDONESIAN
     INTERLINGUA	IA	   INTERNATIONAL AUX.
     INTERLINGUE	IE	   INTERNATIONAL AUX.
     INUKTITUT		IU
     INUPIAK		IK	   ESKIMO
     IRISH		GA	   CELTIC
     ITALIAN		IT	   ROMANCE
     JAPANESE		JA	   ASIAN
     JAVANESE		JV	   OCEANIC/INDONESIAN
     KANNADA		KN	   DRAVIDIAN
     KASHMIRI		KS	   INDIAN
     KAZAKH		KK	   TURKIC/ALTAIC
     KINYARWANDA	RW	   NEGRO-AFRICAN
     KIRGHIZ		KY	   TURKIC/ALTAIC
     KURUNDI		RN	   NEGRO-AFRICAN
     KOREAN		KO	   ASIAN
     KURDISH		KU	   IRANIAN
     LAOTHIAN		LO	   ASIAN
     LATIN		LA	   LATIN/GREEK
     LATVIAN		LV	   BALTIC
     LINGALA		LN	   NEGRO-AFRICAN
     LITHUANIAN		LT	   BALTIC
     MACEDONIAN		MK	   SLAVIC
     MALAGASY		MG	   OCEANIC/INDONESIAN
     MALAY		MS	   OCEANIC/INDONESIAN
     MALAYALAM		ML	   DRAVIDIAN
     MALTESE		MT	   SEMITIC
     MAORI		MI	   OCEANIC/INDONESIAN
     MARATHI		MR	   INDIAN
     MOLDAVIAN		MO	   ROMANCE
     MONGOLIAN		MN
     NAURU		NA
     NEPALI		NE	   INDIAN
     NORWEGIAN		NO	   GERMANIC
     OCCITAN		OC	   ROMANCE
     ORIYA		OR	   INDIAN
     PASHTO		PS	   IRANIAN
     PERSIAN (farsi)	FA	   IRANIAN
     POLISH		PL	   SLAVIC
     PORTUGUESE		PT	   ROMANCE
     PUNJABI		PA	   INDIAN
     QUECHUA		QU	   AMERINDIAN
     RHAETO-ROMANCE	RM	   ROMANCE
     ROMANIAN		RO	   ROMANCE
     RUSSIAN		RU	   SLAVIC
     SAMOAN		SM	   OCEANIC/INDONESIAN
     SANGHO		SG	   NEGRO-AFRICAN
     SANSKRIT		SA	   INDIAN
     SCOTS GAELIC	GD	   CELTIC
     SERBIAN		SR	   SLAVIC
     SERBO-CROATIAN	SH	   SLAVIC
     SESOTHO		ST	   NEGRO-AFRICAN
     SETSWANA		TN	   NEGRO-AFRICAN
     SHONA		SN	   NEGRO-AFRICAN
     SINDHI		SD	   INDIAN
     SINGHALESE		SI	   INDIAN
     SISWATI		SS	   NEGRO-AFRICAN
     SLOVAK		SK	   SLAVIC
     SLOVENIAN		SL	   SLAVIC
     SOMALI		SO	   HAMITIC
     SPANISH		ES	   ROMANCE
     SUNDANESE		SU	   OCEANIC/INDONESIAN
     SWAHILI		SW	   NEGRO-AFRICAN
     SWEDISH		SV	   GERMANIC
     TAGALOG		TL	   OCEANIC/INDONESIAN
     TAJIK		TG	   IRANIAN
     TAMIL		TA	   DRAVIDIAN
     TATAR		TT	   TURKIC/ALTAIC
     TELUGU		TE	   DRAVIDIAN
     THAI		TH	   ASIAN
     TIBETAN		BO	   ASIAN
     TIGRINYA		TI	   SEMITIC
     TONGA		TO	   OCEANIC/INDONESIAN
     TSONGA		TS	   NEGRO-AFRICAN
     TURKISH		TR	   TURKIC/ALTAIC
     TURKMEN		TK	   TURKIC/ALTAIC
     TWI		TW	   NEGRO-AFRICAN
     UIGUR		UG
     UKRAINIAN		UK	   SLAVIC
     URDU		UR	   INDIAN
     UZBEK		UZ	   TURKIC/ALTAIC
     VIETNAMESE		VI	   ASIAN
     VOLAPUK		VO	   INTERNATIONAL AUX.
     WELSH		CY	   CELTIC
     WOLOF		WO	   NEGRO-AFRICAN
     XHOSA		XH	   NEGRO-AFRICAN
     YIDDISH		YI	   GERMANIC
     YORUBA		YO	   NEGRO-AFRICAN
     ZHUANG		ZA
     ZULU		ZU	   NEGRO-AFRICAN

     For example, the locale for the Danish language spoken in Denmark using
     the ISO 8859-1 character set is da_DK.ISO8859-1.  The da stands for the
     Danish language and the DK stands for Denmark.  The short form of da_DK
     is sufficient to indicate this locale.
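
     The structure of a locale name can be illustrated with a small parser.
     The following is a minimal sketch, not part of the NLS interfaces: the
     struct locale_name type and the parse_locale_name() function are
     hypothetical names introduced for this example, and the field sizes are
     arbitrary.

```c
#include <string.h>

/* Hypothetical helper type: the four fields of a locale name. */
struct locale_name {
    char language[16];
    char territory[16];
    char codeset[32];
    char modifier[32];
};

/* Copy at most dstsize-1 bytes of src[0..len) into dst and terminate it. */
static void copy_field(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Split "language[_territory][.codeset][@modifier]" into its fields.
 * Optional fields that are absent are left as empty strings.
 * Returns 0 on success, -1 if the language field is empty.
 */
int parse_locale_name(const char *name, struct locale_name *out)
{
    const char *end = name + strlen(name);
    const char *at = strchr(name, '@');
    const char *mod_start = at ? at : end;
    /* Look for '.' only in the part before any '@modifier'. */
    const char *dot = memchr(name, '.', (size_t)(mod_start - name));
    const char *cs_start = dot ? dot : mod_start;
    /* Look for '_' only in the part before any '.codeset'. */
    const char *us = memchr(name, '_', (size_t)(cs_start - name));
    const char *terr_start = us ? us : cs_start;

    memset(out, 0, sizeof(*out));
    copy_field(out->language, sizeof(out->language), name,
               (size_t)(terr_start - name));
    if (us)
        copy_field(out->territory, sizeof(out->territory), us + 1,
                   (size_t)(cs_start - (us + 1)));
    if (dot)
        copy_field(out->codeset, sizeof(out->codeset), dot + 1,
                   (size_t)(mod_start - (dot + 1)));
    if (at)
        copy_field(out->modifier, sizeof(out->modifier), at + 1,
                   (size_t)(end - (at + 1)));
    return out->language[0] ? 0 : -1;
}
```

     Given da_DK.ISO8859-1, the sketch fills in da, DK, and ISO8859-1 and
     leaves the modifier empty.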

     The environment variable settings are queried by their priority level in
     the following manner:

     ·	 If the LC_ALL environment variable is set, all six categories use the
	 locale it specifies.

     ·	 If the LC_ALL environment variable is not set, each individual cate‐
	 gory uses the locale specified by its corresponding environment vari‐
	 able.

     ·	 If the LC_ALL environment variable is not set, and a value for a
	 particular LC_* environment variable is not set, the value of the
	 LANG environment variable specifies the default locale for all
	 categories.  Only the LANG environment variable should be set in
	 /etc/profile, since this makes it easiest for the user to override
	 the system default using the individual LC_* variables.

     ·	 If the LC_ALL environment variable is not set, a value for a particu‐
	 lar LC_* environment variable is not set, and the value of the LANG
	 environment variable is not set, the locale for that specific cate‐
	 gory defaults to the C locale.	 The C or POSIX locale assumes the
	 ASCII character set and defines information for the six categories.
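
     The lookup order above can be sketched as a small C function.  This is
     an illustration only, not the libc implementation: pick_locale() and
     effective_locale() are hypothetical names, and treating an empty value
     like an unset one is an assumption that follows POSIX.

```c
#include <stdlib.h>

/* Return s if it is a non-empty string, otherwise NULL. */
static const char *nonempty(const char *s)
{
    return (s != NULL && *s != '\0') ? s : NULL;
}

/*
 * Apply the priority order LC_ALL, then the category's own variable,
 * then LANG, and finally the "C" locale.  Each argument is the value
 * of the corresponding environment variable, or NULL if it is unset.
 */
const char *pick_locale(const char *lc_all, const char *lc_category,
                        const char *lang)
{
    const char *v;

    if ((v = nonempty(lc_all)) != NULL)
        return v;
    if ((v = nonempty(lc_category)) != NULL)
        return v;
    if ((v = nonempty(lang)) != NULL)
        return v;
    return "C";
}

/* Same lookup against the real environment; category_var is e.g. "LC_TIME". */
const char *effective_locale(const char *category_var)
{
    return pick_locale(getenv("LC_ALL"), getenv(category_var),
                       getenv("LANG"));
}
```

     A program normally does not need such code, because setlocale(3)
     called with an empty locale string performs this lookup itself.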

   Character Sets
     A character is any symbol used for the organization, control, or
     representation of data.  A group of such symbols used to describe a
     particular language makes up a character set.  The encoding values in a
     character set provide the interface between the system and its input
     and output devices.

     The following character sets are supported in NetBSD:

     ASCII	      The American Standard Code for Information Interchange
		      (ASCII) standard specifies 128 Roman characters and
		      control codes, encoded in a 7-bit character encoding
		      scheme.

     ISO 8859 family  Industry-standard character sets specified by the
		      ISO/IEC 8859 standard.  The standard is divided into 15
		      numbered parts, with each part specifying broad script
		      similarities.  Examples include Western European, Cen‐
		      tral European, Arabic, Cyrillic, Hebrew, Greek, and
		      Turkish.	The character sets use an 8-bit character
		      encoding scheme which is compatible with the ASCII char‐
		      acter set.

     Unicode	      The Unicode character set is the full set of known
		      abstract characters of all real-world scripts.  It can
		      be used in environments where multiple scripts must be
		      processed simultaneously.	 Unicode is compatible with
		      ISO 8859-1 (Western European) and ASCII.	Many character
		      encoding schemes are available for Unicode, including
		      UTF-8, UTF-16 and UTF-32.	 These encoding schemes are
		      multi-byte encodings.  The UTF-8 encoding scheme uses
		      8-bit units in a variable-width encoding that is
		      compatible with ASCII.  The UTF-16 encoding scheme uses
		      16-bit units in a variable-width encoding.  The UTF-32
		      encoding scheme uses 32-bit, fixed-width encodings.
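
     The variable width of UTF-8 and its compatibility with ASCII can be
     seen in a short encoder.  This is a minimal sketch for illustration;
     utf8_encode() is a hypothetical name and not a NetBSD library function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-8 into out.
 * Returns the number of bytes written (1 to 4), or 0 if cp is a
 * surrogate or lies beyond U+10FFFF and therefore cannot be encoded.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        /* ASCII range: a single byte, identical to the ASCII value. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;   /* surrogates are not valid code points to encode */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

     For example, U+0041 (A) is encoded as the single byte 0x41, while
     U+00E6 (the Danish letter ae) becomes the two bytes 0xC3 0xA6.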

   Font Sets
     A font set contains the glyphs to be displayed on the screen for a corre‐
     sponding character in a character set.  A display must support a suitable
     font to display a character set.  If suitable fonts are available to the
     X server, then X clients can include support for different character
     sets.  xterm(1) includes support for Unicode with UTF-8 encoding.	xfd(1)
     is useful for displaying all the characters in an X font.

     The NetBSD wscons(4) console provides support for loading fonts using the
     wsfontload(8) utility.  Currently, only fonts for the ISO8859-1 family of
     character sets are supported.

   Internationalization for Programmers
     To facilitate translations of messages into various languages and to make
     the translated messages available to the program based on a user's
     locale, it is necessary to keep messages separate from the programs and
     provide them in the form of message catalogs that a program can access at
     run time.

     Access to locale information is provided through the setlocale(3) and
     nl_langinfo(3) interfaces.	 See their respective man pages for further
     information.
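
     As a minimal sketch of these two interfaces, the following function
     selects a locale and asks for the name of its character encoding.  The
     function name codeset_for_locale() is hypothetical; the string returned
     for a given locale differs between systems.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Select the locale `name` for all categories and return the name of
 * its codeset, or NULL if the locale could not be selected.  Passing
 * "" makes setlocale(3) take the locale from the environment.
 */
const char *codeset_for_locale(const char *name)
{
    if (setlocale(LC_ALL, name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

     A program would typically call codeset_for_locale("") once at startup.
     Note that the string returned by nl_langinfo(3) may be overwritten by
     later calls to nl_langinfo(3) or setlocale(3).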

     Message source files containing application messages are created by the
     programmer and converted to message catalogs.  These catalogs are used by
     the application to retrieve and display messages, as needed.

     NetBSD supports two message catalog interfaces: the X/Open catgets(3)
     interface and the Uniforum gettext(3) interface.  The catgets(3)
     interface has the advantage that it belongs to a well-supported
     standard.  Unfortunately, the interface is complicated to use and
     maintenance of the catalogs is difficult.  The implementation also does
     not support different character sets.  The gettext(3) interface has not
     been standardized yet; however, it is supported by an increasing number
     of systems.  It also provides many additional tools which make
     programming and catalog maintenance much easier.
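
     A minimal sketch of message retrieval through the catgets(3) interface
     follows.  The function name lookup_message() is hypothetical; the
     catalog itself would be built from a message source file with
     gencat(1).  The message is copied into a caller-supplied buffer because
     a pointer returned by catgets(3) should not be used after the catalog
     is closed.

```c
#include <nl_types.h>
#include <stdio.h>

/*
 * Copy message `msg_id` of set `set_id` from the catalog `catalog`
 * into buf.  If the catalog cannot be opened, or the message is
 * missing, `fallback` is copied instead.
 * Returns 1 if the catalog was opened, 0 otherwise.
 */
int lookup_message(const char *catalog, int set_id, int msg_id,
                   const char *fallback, char *buf, size_t bufsize)
{
    /* NL_CAT_LOCALE selects the catalog according to LC_MESSAGES. */
    nl_catd catd = catopen(catalog, NL_CAT_LOCALE);

    if (catd == (nl_catd)-1) {
        snprintf(buf, bufsize, "%s", fallback);
        return 0;
    }
    /* catgets() returns the fallback string if the message is absent. */
    snprintf(buf, bufsize, "%s", catgets(catd, set_id, msg_id, fallback));
    catclose(catd);
    return 1;
}
```

     Supplying the untranslated text as the fallback keeps the program
     usable when no catalog is installed for the user's locale.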

   Support for Multi-byte Encodings
     Some character sets with multi-byte encodings may be difficult to decode,
     or may contain state (i.e., adjacent characters are dependent).  ISO C
     specifies a set of functions using 'wide characters' which can handle
     multi-byte encodings properly.  The behaviour of these functions is
     affected by the LC_CTYPE category of the current locale.

     A wide character is specified in ISO C as being a fixed number of bits
     wide and is stateless.  There are two types for wide characters: wchar_t
     and wint_t.  wchar_t is a type which can contain one wide character and
     behaves as the 'char' type does for one character.  wint_t can contain
     one wide character or WEOF (wide EOF).
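
     The difference between the two types matters when reading input,
     because the result of a read must be able to hold WEOF as well as any
     wide character.  A minimal sketch, using the hypothetical function name
     count_wide_chars():

```c
#include <stdio.h>
#include <wchar.h>

/*
 * Count the wide characters remaining on a stream.
 * The result of fgetwc(3) is stored in a wint_t, not a wchar_t, so
 * that WEOF can be told apart from every valid wide character.
 * fgetwc(3) also returns WEOF on a read or encoding error; this
 * sketch simply stops counting in that case.
 */
size_t count_wide_chars(FILE *fp)
{
    size_t n = 0;
    wint_t wc;

    while ((wc = fgetwc(fp)) != WEOF)
        n++;
    return n;
}
```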

     There are functions that operate on wchar_t, and substitute for functions
     operating on 'char'.  See wmemchr(3) and towlower(3) for details.	There
     are some additional functions that operate on wchar_t.  See wctype(3) and
     wctrans(3) for details.

     Wide characters should be used for all I/O processing which may rely on
     locale-specific strings.  The two primary issues requiring special use of
     wide characters are:

	   ·   All I/O is performed using multibyte characters.	 Input data is
	       converted into wide characters immediately after reading and
	       data for output is converted from wide characters to multi-byte
	       encoding immediately before writing.  Conversion is performed
	       with the mbstowcs(3), mbsrtowcs(3), wcstombs(3), wcsrtombs(3),
	       mblen(3), mbrlen(3), and mbsinit(3) functions.

	   ·   Wide characters are used directly for I/O, using getwchar(3),
	       fgetwc(3), getwc(3), ungetwc(3), fgetws(3), putwchar(3),
	       fputwc(3), putwc(3), and fputws(3).  They are also used with
	       the wide-character formatted I/O functions such as fwscanf(3),
	       wscanf(3), swscanf(3), fwprintf(3), wprintf(3), swprintf(3),
	       vfwprintf(3), vwprintf(3), and vswprintf(3), and with the
	       wide-character conversion specifiers %lc, %C, %ls, and %S of
	       the conventional formatted I/O functions.
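
     The first approach, converting at the I/O boundary, can be sketched as
     follows.  The function name mb_toupper() is hypothetical.  The sketch
     assumes that the caller has already selected a locale with
     setlocale(3), and it relies on the POSIX behaviour of mbstowcs(3)
     returning the required length when the destination is NULL.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Upper-case a multibyte string according to the current LC_CTYPE:
 * convert to wide characters, map each one with towupper(3), and
 * convert back to the multibyte encoding.
 * Returns 0 on success, -1 on invalid input, allocation failure, or
 * if out is too small to hold the terminated result.
 */
int mb_toupper(const char *in, char *out, size_t outsize)
{
    size_t n = mbstowcs(NULL, in, 0);   /* length in wide characters */
    size_t i, written;
    wchar_t *wbuf;
    int rc = -1;

    if (n == (size_t)-1)
        return -1;                      /* invalid multibyte sequence */
    wbuf = malloc((n + 1) * sizeof(*wbuf));
    if (wbuf == NULL)
        return -1;
    if (mbstowcs(wbuf, in, n + 1) != (size_t)-1) {
        for (i = 0; i < n; i++)
            wbuf[i] = (wchar_t)towupper((wint_t)wbuf[i]);
        written = wcstombs(out, wbuf, outsize);
        /* out is terminated only if fewer than outsize bytes were stored. */
        if (written != (size_t)-1 && written < outsize)
            rc = 0;
    }
    free(wbuf);
    return rc;
}
```

     Working on wide characters means that the loop handles one character
     per iteration even when the multibyte encoding uses several bytes for
     it.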

SEE ALSO
     gencat(1), xfd(1), xterm(1), catgets(3), gettext(3), nl_langinfo(3),
     setlocale(3), wsfontload(8)

BUGS
     This man page is incomplete.

BSD			       February 21, 2007			   BSD
