colltbl man page on IRIX

colltbl(1M)							   colltbl(1M)

NAME
     colltbl - create collation database

SYNOPSIS
     colltbl [ file | - ]

DESCRIPTION
     The colltbl command takes as input a specification file, file, that
     describes the collating sequence for a particular language and creates a
     database that can be read by strxfrm(3C) and strcoll(3C).	strxfrm(3C)
     transforms its first argument and places the result in its second
     argument.	The transformed string is such that it can be correctly
     ordered with other transformed strings by using strcmp(3C), strncmp(3C),
     or memcmp(3C).  strcoll(3C) transforms its arguments and does a
     comparison.
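
     As an illustration of the relationship described above, the following
     is a minimal sketch in C.  The function names compare_via_strxfrm and
     compare_via_strcoll are hypothetical helpers, not part of any library;
     only strxfrm(3C), strcoll(3C), and strcmp(3C) are standard calls.  The
     caller is assumed to have selected a collation locale with
     setlocale(3C) beforehand.

```c
#include <assert.h>
#include <locale.h>   /* for the caller's setlocale(LC_COLLATE, ...) */
#include <stdlib.h>
#include <string.h>

/* Return -1, 0, or 1 according to the sign of v. */
static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Hypothetical helper: transform a and b with strxfrm() and compare the
 * results with strcmp().  Returns -1, 0, or 1, or -2 if memory runs out.
 */
int compare_via_strxfrm(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* With a size of 0, strxfrm() only reports the length it needs. */
    size_t na = strxfrm(NULL, a, 0) + 1;
    size_t nb = strxfrm(NULL, b, 0) + 1;
    char *ta = malloc(na);
    char *tb = malloc(nb);
    int result = -2;

    if (ta != NULL && tb != NULL) {
        strxfrm(ta, a, na);
        strxfrm(tb, b, nb);
        result = sign_of(strcmp(ta, tb));
    }
    free(ta);
    free(tb);
    return result;
}

/* Hypothetical helper: compare a and b directly with strcoll(). */
int compare_via_strcoll(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign_of(strcoll(a, b));
}
```

     Both helpers should agree on the sign of the result for any pair of
     strings in a given locale.  Transforming once with strxfrm(3C) tends to
     pay off when the same strings are compared many times, as in a sort.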

     If no input file is supplied, stdin is read.

     The output file produced contains the database with collating sequence
     information in a form usable by system commands and routines.  The name
     of this output file is the value you assign to the keyword codeset read
     in from file.  Before this file can be used, it must be installed in the
     /usr/lib/locale/locale directory with the name LC_COLLATE by someone who
     is super-user or a member of group bin.  locale corresponds to the
     language area whose collation sequence is described in file.  This file
     must be readable by user, group, and other; no other permissions should
     be set.  To use the collating sequence information in this file, set the
     LC_COLLATE environment variable appropriately [see environ(5) or
     setlocale(3C)].

     The colltbl command can support languages whose collating sequence can be
     completely described by the following cases:

     o	 Ordering of single characters within the code set.  For example, in
	 Swedish, V is sorted after U, before X, and with W (V and W are
	 considered identical as far as sorting is concerned).

     o	 Ordering of ``double characters'' in the collation sequence.  For
	 example, in Spanish, ch and ll are collated after c and l,
	 respectively.

     o	 Ordering of a single character as if it consists of two characters.
	 For example, in German, the ``sharp s'' (ß) is sorted as ss.  This is
	 a special instance of the next case below.

     o	 Substitution of one character string with another character string.
	 In the example above, the string ß is replaced with ss during
	 sorting.

     o	 Ignoring certain characters in the code set during collation.	For
	 example, if - were ignored during collation, then the strings
	 re-locate and relocate would compare as equal.

     o	 Secondary ordering between characters.	 In the case where two
	 characters are sorted together in the collation sequence (that is,
	 they have the same "primary" ordering), there is sometimes a
	 secondary ordering that is used if two strings are identical except
	 for characters that have the same primary ordering.  For example, in
	 French, the letters e and è have the same primary ordering but e
	 comes before è in the secondary ordering.  Thus the word lever would
	 be ordered before lèver, but lèver would be sorted before levitate.
	 (Note that if e came before è in the primary ordering, then lèver
	 would be sorted after levitate.)
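
     The interaction between primary ordering, secondary ordering, and
     ignored characters described above can be sketched in C as follows.
     This is an illustration only, not the algorithm used by strcoll(3C).
     It assumes single-byte ISO 8859-1 input (so that the accented e is the
     byte 0xE8), an ASCII-compatible code set, and a toy collating sequence
     of the lowercase letters a through z.  All function names are
     hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: ISO 8859-1 input, where e-grave is the single byte 0xE8. */
#define E_GRAVE 0xE8

/* Primary weight: a..z map to 1..26 and e-grave shares the weight of e.
 * Any other byte gets 0, which this sketch treats as "ignored". */
static int primary_weight(unsigned char c)
{
    if (c == E_GRAVE)
        return 'e' - 'a' + 1;
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 1;
    return 0;
}

/* Secondary weight: plain e sorts before e-grave; all others tie. */
static int secondary_weight(unsigned char c)
{
    return c == E_GRAVE ? 1 : 0;
}

/* Advance past bytes that are excluded from the ordering. */
static const unsigned char *skip_ignored(const unsigned char *s)
{
    while (*s != '\0' && primary_weight(*s) == 0)
        s++;
    return s;
}

/* Compare by primary weights; use the first secondary difference only
 * if the strings are identical at the primary level. */
int two_level_compare(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    int secondary = 0;

    assert(a != NULL && b != NULL);
    for (;;) {
        p = skip_ignored(p);
        q = skip_ignored(q);
        if (*p == '\0' || *q == '\0')
            break;
        int wp = primary_weight(*p);
        int wq = primary_weight(*q);
        if (wp != wq)
            return wp < wq ? -1 : 1;
        if (secondary == 0) {
            int sp = secondary_weight(*p);
            int sq = secondary_weight(*q);
            if (sp != sq)
                secondary = sp < sq ? -1 : 1;
        }
        p++;
        q++;
    }
    if (*p != '\0')
        return 1;          /* a has more weighted characters than b */
    if (*q != '\0')
        return -1;         /* b has more weighted characters than a */
    return secondary;
}
```

     With these weights, lever compares before the accented form, the
     accented form compares before levitate, and re-locate compares equal
     to relocate because the hyphen carries no weight.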

     The specification file consists of three types of statements:

     1.	 codeset   filename

	 filename is the name of the output file to be created by colltbl.

     2.	 order is  order_list

	 order_list is a list of symbols, separated by semicolons, that
	 defines the collating sequence.  The special symbol, ..., specifies
	 symbols that are lexically sequential in a short-hand form.  For
	 example,
	      order is	a;b;c;d;...;x;y;z

	 would specify the list of lowercase letters.  Of course, this could
	 be further compressed to just a;...;z.

	 A symbol can be up to two bytes in length and can be represented in
	 any one of the following ways:

	 o   the symbol itself (for example, a for the lowercase letter a),

	 o   in octal representation (for example, \141 or 0141 for the letter
	     a), or

	 o   in hexadecimal representation (for example, \x61 or 0x61 for the
	     letter a).

	 Any combination of these may be used as well.

	 The backslash character, \ , is used for continuation.	 No characters
	 are permitted after the backslash character.

	 Symbols enclosed in parentheses are assigned the same primary
	 ordering but different secondary ordering.  Symbols enclosed in curly
	 brackets are assigned only the same primary ordering.	For example,

	      order is	a;b;c;ch;d;(e;è);f;...;z;\
			{1;...;9};A;...;Z

	 In the above example, e and è are assigned the same primary ordering
	 and different secondary ordering; digits 1 through 9 are assigned the
	 same primary ordering and no secondary ordering.  Only primary
	 ordering is assigned to the remaining symbols.	 Notice how double
	 letters can be specified in the collating sequence (letter ch comes
	 between c and d).

	 If a character is not included in the order is statement, it is
	 excluded from the ordering and will be ignored during sorting.
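
	 The three symbol notations listed under the order is statement
	 (literal, octal, and hexadecimal) can be decoded as in the following
	 C sketch.  It is a simplified illustration, not colltbl's parser:
	 it handles single-byte symbols only, assumes an ASCII-compatible
	 code set, and treats any one-character token as the symbol itself.
	 The name decode_symbol is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decode one single-byte symbol token.
 * Returns the byte value (0..255), or -1 if this sketch does not
 * understand the token (for example, a two-character symbol such as ch).
 */
int decode_symbol(const char *tok)
{
    const char *digits;
    const char *allowed;
    int base;
    long value;

    assert(tok != NULL);

    /* A token of exactly one character stands for itself. */
    if (tok[0] != '\0' && tok[1] == '\0')
        return (unsigned char)tok[0];

    if ((tok[0] == '\\' || tok[0] == '0') && tok[1] == 'x') {
        digits = tok + 2;              /* \x61 or 0x61 */
        allowed = "0123456789abcdefABCDEF";
        base = 16;
    } else if (tok[0] == '\\' || tok[0] == '0') {
        digits = tok + 1;              /* \141 or 0141 */
        allowed = "01234567";
        base = 8;
    } else {
        return -1;
    }

    /* Require at least one digit and nothing but digits of the base. */
    if (*digits == '\0' || strspn(digits, allowed) != strlen(digits))
        return -1;

    value = strtol(digits, NULL, base);
    if (value > 255)
        return -1;                     /* does not fit in one byte */
    return (int)value;
}
```

	 Under these assumptions, the tokens a, \141, 0141, \x61, and 0x61
	 all decode to the same byte value.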

     3.	 substitute string with repl

	 The substitute statement substitutes the string string with the
	 string repl.  This can be used, for example, to provide rules to sort
	 the abbreviated month names numerically:

	      substitute "Jan" with "01"
	      substitute "Feb" with "02"
		   .
		   .
		   .
	      substitute "Dec" with "12"

	 A simpler use of the substitute statement would be to substitute a
	 single character with two characters, as with the substitution of ß
	 with ss in German.
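
	 The effect of substitute rules on a string can be sketched in C as
	 follows.  This is an illustration under stated assumptions, not the
	 code used by strxfrm(3C): it scans left to right and applies the
	 first rule that matches at each position.  The struct and function
	 names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "substitute from with to" rule (hypothetical representation). */
struct substitution {
    const char *from;
    const char *to;
};

/*
 * Copy src to dst, replacing each occurrence of a rule's from string
 * with its to string.  At each position the first matching rule wins.
 * Returns 0 on success, or -1 if dst (of size dstsize) is too small.
 */
int apply_substitutions(const struct substitution *rules, size_t nrules,
                        const char *src, char *dst, size_t dstsize)
{
    size_t out = 0;

    assert(rules != NULL && src != NULL && dst != NULL);

    while (*src != '\0') {
        const char *piece = src;   /* default: copy one byte unchanged */
        size_t piece_len = 1;
        size_t advance = 1;

        for (size_t i = 0; i < nrules; i++) {
            size_t flen = strlen(rules[i].from);
            if (flen > 0 && strncmp(src, rules[i].from, flen) == 0) {
                piece = rules[i].to;
                piece_len = strlen(rules[i].to);
                advance = flen;
                break;
            }
        }
        if (out + piece_len >= dstsize)
            return -1;             /* no room for piece plus terminator */
        memcpy(dst + out, piece, piece_len);
        out += piece_len;
        src += advance;
    }
    if (out >= dstsize)
        return -1;                 /* no room for the terminator */
    dst[out] = '\0';
    return 0;
}
```

	 With rules for "Jan", "Feb", and "Dec" as shown above, the input
	 "Feb 14" becomes "02 14", so that month names compare in calendar
	 order rather than alphabetical order.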

     The substitute statement is optional.  The order is and codeset
     statements must appear in the specification file.

     Any lines in the specification file with a # in the first column are
     treated as comments and are ignored.  Empty lines are also ignored.

EXAMPLE
     The following example shows the collation specification required to
     support a hypothetical telephone book sorting sequence.

     The sorting sequence is defined by the following rules:

     a.  Upper- and lowercase letters must be sorted together, but uppercase
         letters have precedence over lowercase letters.

     b.  All special characters and punctuation should be ignored.

     c.  Digits must be sorted as their alphabetic counterparts (for example,
         0 as zero, 1 as one).

     d.  The Ch, ch, CH combinations must be collated between C and D.

     e.  V and W, v and w must be collated together.

     The input specification file to colltbl will contain:

	       codeset	 telephone
	       order is	 A;a;B;b;C;c;CH;Ch;ch;D;d;E;e;F;f;\
			 G;g;H;h;I;i;J;j;K;k;L;l;M;m;N;n;O;o;P;p;\
			 Q;q;R;r;S;s;T;t;U;u;{V;W};{v;w};X;x;Y;y;Z;z
	       substitute "0" with "zero"
	       substitute "1" with "one"
	       substitute "2" with "two"
	       substitute "3" with "three"
	       substitute "4" with "four"
	       substitute "5" with "five"
	       substitute "6" with "six"
	       substitute "7" with "seven"
	       substitute "8" with "eight"
	       substitute "9" with "nine"

FILES
     /lib/locale/locale/LC_COLLATE
		     LC_COLLATE database for locale

     /usr/lib/locale/C/colltbl_C
		     input file used to construct LC_COLLATE in the default
		     locale.

SEE ALSO
     memory(3C), setlocale(3C), strcoll(3C), string(3C), strxfrm(3C),
     environ(5)
