sort man page on DigitalUNIX


sort(1)								       sort(1)

       sort - Sorts or merges files

       sort  [-m]  [-o	output_file] [-Abdfinru] [-k keydef]... [-t character]
       [-T directory] [-y] [kilobytes] [-z record_size]... file...

       sort -c	[-u] [-Abdfinru] [-k keydef]... [-t character] [-T  directory]
       [-y] [kilobytes] [-z record_size]... file...

       The  following  older syntax is now maintained for backward compatibil‐
       ity, but may be withdrawn in future issues: sort [-Abcdfimnru] [-o out‐
       put_file]    [-t character]    [-T    directory]	   [-y]	   [kilobytes]
       [-z record_size]	 [+fskip]  [.cskip]  [-fskip]  [.cskip]	  [-bdfinr]...

       Interfaces  documented on this reference page conform to industry stan‐
       dards as follows:

       sort:  XCU5.0

       Refer to the standards(5) reference page	 for  more  information	 about
       industry standards and associated tags.

       The -d, -f, -i, -n, and -r options override the default ordering rules.
       When ordering options appear independent of any key field
       specifications, the requested field ordering rules are applied globally
       to all sort keys.  When attached to a specific key (see -k), the
       specified ordering options override all global ordering options for
       that key.  In the obsolescent forms, if one or more of these options
       follows a +fskip option, it affects only the key field specified by
       that preceding option.

       -A
              [Tru64 UNIX]  Sorts on a byte-by-byte basis using each
              character's encoded value.  On some systems, extended characters
              are treated as negative values and so sort before ASCII
              characters.  If you are sorting ASCII characters in a
              non-C/POSIX locale, this option is much faster.

       -b
              Ignores leading spaces and tabs when determining the starting
              and ending positions of a restricted sort key.  If the -b option
              is specified before the first -k option, the -b option is
              applied to all -k options on the command line; otherwise, the -b
              option can be independently attached to each -k field_start or
              field_end argument.

       -c
              Checks that the input is sorted according to the ordering rules
              specified in the options and the collating sequence of the
              current locale.  No output is produced; only the exit code is
              affected.

       -d
              Specifies that only spaces and alphanumeric characters
              (according to the current setting of LC_CTYPE) are significant
              in comparisons.

       -f
              Treats all lowercase characters as their uppercase equivalents
              (according to the current setting of LC_CTYPE) for the purposes
              of comparison.

       -i
              Sorts only by printable characters (according to the current
              setting of LC_CTYPE).

       -k keydef
              Specifies one or more (up to 50) restricted sort key field
              definitions.  This option replaces the obsolescent +fskip.cskip
              and -fskip.cskip options.  A field comprises a maximal sequence
              of non-separating characters and, in the absence of the -t
              option, any preceding field separator.

              The format of a key field definition is as follows:
              field_start[type][,field_end[type]]

	      The field_start and field_end arguments define a key field  that
	      is  restricted  to a portion of the line, and type is a modifier
	      specified by b, d, f, i, n, r, or t.   The  b  modifier  behaves
	      like  the	 -b  option,  but  applies  only to the field_start or
	      field_end argument to which it  is  attached.   The  t  modifier
	      indicates that the key field is processed as CPU time. The other
	      modifiers behave like their  corresponding  options,  but	 apply
	      only  to	the  key field to which they are attached; these modi‐
	      fiers have this effect if specified with field_start,  field_end
	      or both.

	      Modifiers	 attached to a field_start or field_end argument over‐
	      ride  any	 specifications	 made  by  the	options.   A   missing
	      field_end	 argument  means the last character of the line.  When
	      multiple sort keys are specified, it is advisable to  specify  a
	      field_end argument to avoid possible confusion.
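              The effect of a missing field_end can be seen in a small
              sketch.  The two-line input and the variable names are made up
              for illustration, and LC_ALL=C is set only to make the ordering
              predictable; a sort that accepts the -k syntax described here is
              assumed.

```shell
# Hypothetical two-line input: same first field, different numbers.
input='a 1
a 2'

# No field_end: the first key runs from field 1 to the end of the line,
# so it already decides the order and the second key is never consulted.
open_ended=$(printf '%s\n' "$input" | LC_ALL=C sort -k 1 -k 2,2nr)

# With field_end: the first key is field 1 only, the lines tie on it,
# and the second key (field 2, numeric, reversed) decides.
bounded=$(printf '%s\n' "$input" | LC_ALL=C sort -k 1,1 -k 2,2nr)

printf 'open-ended key:\n%s\n' "$open_ended"
printf 'bounded key:\n%s\n' "$bounded"
```

              With the open-ended key the lines stay in ascending order; with
              the bounded key the line ending in 2 comes first.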

	      The field_start portion of the keydef argument takes the follow‐
	      ing form: field_number[.first_character]

              Fields and characters within fields are numbered starting with
              1.  The field_number and first_character pieces, interpreted as
              positive decimal integers, specify the first character to be
              used as part of a sort key.  If first_character is not
              specified, the default is the first character of the field.

	      The field_end portion of the keydef argument takes the following
	      form: field_number[.last_character]

	      The  field_number	 syntax	 is  the  same	as  that described for
	      field_start.  The last_character argument, interpreted as a non‐
	      negative	decimal	 integer,  specifies  the last character to be
	      used as part of the sort key.  If last_character evaluates to  0
	      (zero) or is not specified, the default is the last character of
	      the field specified by field_number.

	      If -b is in effect, characters within a field are	 counted  from
	      the  first nonspace character in the field.  (This applies sepa‐
	      rately to first_character and last_character.)
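              As a rough illustration of character positions within a field,
              the following sketch sorts on characters 2 through 3 of the
              second field.  The input lines and the variable name are
              invented for the example, and LC_ALL=C is set only to keep the
              result predictable.

```shell
# Hypothetical input: field 2 is a letter followed by a two-digit code.
input='a:x31
b:x12
c:x20'

# -k 2.2,2.3 restricts the key to characters 2 through 3 of field 2,
# so the comparison sees only "31", "12", and "20".
by_code=$(printf '%s\n' "$input" | LC_ALL=C sort -t : -k 2.2,2.3)

printf '%s\n' "$by_code"
```

              Because -t : is given, the separator is not part of the field,
              so character 1 of field 2 is the letter x.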

	      If -k is not specified, the default sort key is the entire line.

	      When there are multiple key fields, later keys are compared only
	      after  all  earlier  keys	 compare as equal.  Except when the -u
	      option is specified, lines that otherwise compare as  equal  are
	      ordered as though none of the options -d, -f, -i, -n, or -k were
	      present (but with -r still in effect, if it was  specified)  and
	      with all bytes in the lines significant to the comparison.
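              The two rules above can be sketched as follows.  The sample
              data and variable names are hypothetical, and LC_ALL=C is set
              only so that the byte order is predictable.

```shell
# Hypothetical input with three blank-separated fields.
input='x b 2
y a 10
z a 9'

# Field 2 is compared first; the numeric key on field 3 is consulted
# only for the two lines whose second field ties.
two_keys=$(printf '%s\n' "$input" | LC_ALL=C sort -k 2,2 -k 3,3n)

# Under -f the keys "b" and "B" compare as equal, so the order falls
# back to a comparison of the whole lines with all bytes significant.
tie_break=$(printf 'b 1\nB 1\n' | LC_ALL=C sort -f -k 1,1)

printf '%s\n' "$two_keys"
printf '%s\n' "$tie_break"
```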

	      The algorithm for the -k option can be summarized as follows:

                  -k a.b,c.d = if d==0 then +(a-1).(b-1) -c.d
                               else +(a-1).(b-1) -(c-1).d
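              The arithmetic in this summary can be worked through with a
              small helper.  This is only a sketch: the function name
              k_to_old is made up, and it prints the obsolescent form rather
              than running sort.

```shell
# k_to_old A B C D prints the obsolescent equivalent of "-k A.B,C.D",
# following the summary above. The function name is illustrative only.
k_to_old() {
    a=$1 b=$2 c=$3 d=$4
    if [ "$d" -eq 0 ]; then
        printf '+%d.%d -%d.%d\n' "$((a - 1))" "$((b - 1))" "$c" "$d"
    else
        printf '+%d.%d -%d.%d\n' "$((a - 1))" "$((b - 1))" "$((c - 1))" "$d"
    fi
}

k_to_old 2 1 2 0    # -k 2.1,2.0 becomes +1.0 -2.0
k_to_old 2 2 3 4    # -k 2.2,3.4 becomes +1.1 -2.4
```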
       -m
              Merges only (assumes sorted input).

       -n
              Sorts any initial numeric strings (consisting of optional
              spaces, an optional minus sign, and zero or more digits with an
              optional radix character and thousands separators, as defined by
              the current locale) by arithmetic value.  An empty digit string
              is treated as zero; leading zeros and signs on zeros do not
              affect ordering.  Only one period (.) can be used in a numeric
              string.  Any subsequent period, and every character to its
              right, is ignored.

       -o output_file
              Directs output to output_file instead of standard output.  The
              output_file can be the same as one of the input files.

       -r
              Reverses the order of the specified sort.

       -t character
              Sets the field separator character to character.  The character
              argument is not considered to be part of a field (although it
              can be included in a sort key).  Each occurrence of character is
              significant (for example, two consecutive occurrences of
              character delimit an empty field).  To specify the tab character
              as the field separator, you must enclose it in ' ' (single
              quotes).

              The default field separator is one or more spaces.

       -T directory
              [Tru64 UNIX]  Places all the temporary files that are created in
              directory.

       -u
              Suppresses all but one in each set of equal lines (for example,
              lines whose sort keys match exactly).  Ignored characters, such
              as leading tabs and spaces, and characters outside of sort keys
              are not considered in this type of comparison.

              If used with the -c option, -u checks that there are no lines
              with duplicate keys, in addition to checking that the input file
              is sorted.

       -y [kilobytes]
              [Tru64 UNIX]  Starts the sort command using kilobytes of main
              storage and adds storage as needed.  (If kilobytes is less than
              the minimum storage size or greater than the maximum, the
              minimum or maximum is used instead.)  If the -y option is
              omitted, the sort command starts with the default storage size;
              -y 0 starts with minimum storage, and -y (with no value) starts
              with the maximum storage.  The amount of storage used by the
              sort command has a significant impact on performance.  Sorting a
              small file in a large amount of storage is wasteful.

       -z record_size
              Prevents abnormal termination if lines being sorted are longer
              than the default buffer size can handle.  When the -c or -m
              options are specified, the sorting phase is omitted and a system
              default size buffer is used.  If sorted lines are longer than
              this size, sort terminates abnormally.  The -z option specifies
              that the longest line be recorded in the sort phase so that
              adequate buffers can be allocated in the merge phase.  The
              record_size argument must be a value in bytes equal to or
              greater than the number of bytes in the longest line to be
              merged.

       +fskip.cskip
              Specifies the start position of a key field.  See the -k option
              for a description of the current way to perform this operation.
              (Obsolescent)

              The fskip variable specifies the number of fields to skip from
              the beginning of the input line, and the cskip variable
              specifies the number of additional characters to skip to the
              right beyond that point.  For both the starting point
              (+fskip.cskip) and the ending point (-fskip.cskip) of a sort
              key, fskip is measured from the beginning of the input line, and
              cskip is measured from the last field skipped.  If you omit
              cskip, 0 (zero) is assumed.  If you omit fskip, 0 (zero) is
              assumed.  If you omit the ending field specifier (-fskip.cskip),
              the end of the line is the end of the sort key.

              You can supply more than one sort key by repeating +fskip.cskip
              and -fskip.cskip.  When you specify more than one sort key, keys
              specified further to the right on the command line are compared
              only after all earlier keys compare as equal.  For example, if
              the first key is to be sorted in numerical order and the second
              according to the collating sequence, all strings that start with
              the number 1 are sorted according to the collating order before
              the strings that start with the number 2.  Lines that are
              identical in all keys are sorted with all characters
              significant.  You can also specify different options for
              different sort keys.

       -fskip.cskip
              Specifies the end position of a key field.  See the -k option
              for a description of the current way to perform this operation.
              (Obsolescent)

       The sort command sorts lines in its input files and writes  the	result
       to standard output.

       The sort command performs one of the following functions:

              Sorts lines of all the named files together and writes the
              result to the specified output.

              Merges lines of all the named (presorted) files together and
              writes the result to the specified output.

              Checks that a single input file is correctly presorted.

       Comparisons are based on one or more sort keys extracted from each line
       of input (or the entire line if no sort keys are	 specified),  and  are
       performed using the collating sequence of the current locale.

       The sort command treats all of its input files as one file when it per‐
       forms the sort.	A - (dash) in place of a file name specifies  standard
       input.  If you do not specify a file name, it sorts standard input.
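       A minimal sketch of combining standard input with a named file follows.
       The scratch file and variable names are hypothetical, mktemp is assumed
       to be available, and LC_ALL=C is set only to make the order
       predictable.

```shell
# Hypothetical scratch file holding part of the input.
tmpfile=$(mktemp)
printf 'pear\napple\n' > "$tmpfile"

# "-" stands for standard input, so the piped line and the file's lines
# are sorted together as one input.
combined=$(printf 'mango\n' | LC_ALL=C sort - "$tmpfile")

rm -f "$tmpfile"
printf '%s\n' "$combined"
```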

       The sort command can handle a variety of collation rules typically used
       in Western European  languages,	including  primary/secondary  sorting,
       one-to-two  character  mapping, N-to-one character mapping, and ignore-
       character mapping.  To summarize briefly:

   Primary/Secondary Sorting
       In this system, a group of characters all  sort	to  the	 same  primary
       location.   If  there is a tie, a secondary sort is applied.  For exam‐
       ple, in French, the plain and accented a's all sort to the same primary
       location.   If  two  strings  collate to the same primary location, the
       secondary sort goes into effect.  These words are in correct French
       order:

       abord âpre après âpreté azur

   One-to-Two Character Mappings
       This  system  requires  that certain single characters be treated as if
       they were two characters.  For example, in German, the ß (scharfes-S) is
       collated as if it were ss.

   N-to-One Character Mappings
       Some  languages	treat  a string of characters as if it were one single
       collating element.  For example, in Spanish, the ch  and	 ll  sequences
       are  treated  as	 their	own  elements  within the alphabet.  (ch comes
       between c and d in the alphabet, and ll comes between l and m.)

   Ignore-Character Mappings
       In some cases, certain characters may be ignored in collation.  For
       example, if - were defined as an ignore-character, the strings
       re-locate and relocate would sort to the same place.

       The results that you get from sort depend on the collating sequence as
       defined by the current setting of the LC_COLLATE environment variable.
       The configuration files for collation and character classification
       information are /usr/lib/nls/loc/src/locale.src.

       A field is one or more characters bounded by the beginning of a line
       and the current field separator, or one or more characters bounded by a
       field separator on either side.  The space character is the default
       field separator.

       Lines longer than 1024 bytes are truncated by sort.  The maximum number
       of fields on a line is 50.

       The sort command returns the following exit values:

       0
              All input files were output successfully, or -c was specified
              and the input file was correctly sorted.

       1
              Under the -c option, the file was not ordered as specified, or
              if the -c and -u options were both specified, two input lines
              were found with equal keys.

       >1
              An error occurred.
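       The exit values for -c can be observed with a short sketch.  The input
       lines and variable names are invented for the example; standard error
       is discarded because some implementations may print a diagnostic there.

```shell
# Sorted input: -c reports success (0).
sorted_status=0
printf 'a\nb\nc\n' | LC_ALL=C sort -c 2>/dev/null || sorted_status=$?

# Out-of-order input: -c reports 1.
unsorted_status=0
printf 'b\na\n' | LC_ALL=C sort -c 2>/dev/null || unsorted_status=$?

# Sorted input with a repeated line: -c alone succeeds ...
dup_status=0
printf 'a\na\nb\n' | LC_ALL=C sort -c 2>/dev/null || dup_status=$?

# ... but -c together with -u also rejects equal keys and reports 1.
dup_unique_status=0
printf 'a\na\nb\n' | LC_ALL=C sort -c -u 2>/dev/null || dup_unique_status=$?

echo "$sorted_status $unsorted_status $dup_status $dup_unique_status"
```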

       The following examples apply to the C locale, unless it is specifically
       stated otherwise.

       1.  To perform a simple sort, enter:

                  sort fruits

              This displays the contents of fruits sorted in ascending
              lexicographic order.  This means that the characters in each
              column are compared one by one, including spaces, digits, and
              special characters.

              For instance, if fruits contains the text:

                  banana
                  orange
                  Persimmon
                  apple
                  %%banana
                  apple
                  ORANGE

              Then sort fruits displays:

                  %%banana
                  ORANGE
                  Persimmon
                  apple
                  apple
                  banana
                  orange

              This order follows from the fact that in the ASCII collating
              sequence, symbols (such as %) precede uppercase letters, and all
              uppercase letters precede the lowercase letters.  If you are
              using a different collating order, your results may be
              different.

       2.  To group lines that contain uppercase and special characters with
           similar lowercase lines, and remove duplicate lines, enter:

                  sort -d -f -u fruits

              The -u option tells sort to remove duplicate lines, making each
              line of the file unique.  This displays:

                  apple
                  %%banana
                  orange
                  Persimmon

              Not only was the duplicate apple removed, but banana and ORANGE
              were removed as well.  The -d option told sort to ignore
              symbols, so %%banana and banana were considered to be duplicate
              lines and banana was removed.  The -f option told sort not to
              differentiate between uppercase and lowercase, so ORANGE and
              orange were considered to be duplicate lines and ORANGE was
              removed.

              When the -u option is used with input that contains nonidentical
              lines that are considered by sort (due to other options) to be
              duplicates, there is no way to predict which lines sort will
              keep and which it will remove.

       3.  To sort as in Example 2, but remove only lines that are exactly
           identical (keeping lines that differ in capitalization or
           punctuation), enter:

                  sort -u -k 1df -k 1 fruits

              Options appearing between sort key specifiers apply only to the
              specifier preceding them.  There are two sorts specified in this
              command line.  The -k 1df argument specifies the first sort, of
              the same type done with -d -f in Example 2.  Then -k 1 performs
              another comparison to distinguish lines that are not actually
              identical.  This prevents -u, which applies to both sorts
              because it precedes the first sort key specifier, from removing
              lines that are not exactly identical to other lines.

              Given the fruits file shown in Example 1, the added -k 1
              distinguishes %%banana from banana and ORANGE from orange.
              However, the two instances of apple are exactly identical, so
              one of them is deleted.

                  apple
                  %%banana
                  banana
                  ORANGE
                  orange
                  Persimmon

       4.  To specify a new field separator, enter:

                  sort -t : -k 2 vegetables

              This sorts vegetables, comparing the text that follows the first
              colon on each line.  The -t : option tells sort that colons
              separate fields.  The -k 2 argument tells sort to ignore the
              first field and to compare from the start of the second field to
              the end of the line.  If vegetables contains:

                  yams:104
                  turnips:8
                  potatoes:15
                  carrots:104
                  green beans:32
                  radishes:5
                  lettuce:15

              then sort -t : -k 2 vegetables displays:

                  carrots:104
                  yams:104
                  lettuce:15
                  potatoes:15
                  green beans:32
                  radishes:5
                  turnips:8

              The numbers are not in ascending order.  This is because a
              lexicographic sort compares each character from left to right.
              In other words, 3 comes before 5, so 32 comes before 5.

       5.  To sort on more than one field, enter:

                  sort -t : -k 2n -k 1r vegetables

              This performs a numeric sort on the second field (-k 2n) and
              then, within that ordering, sorts the first field in reverse
              collating order (-k 1r).  The output looks like this:

                  radishes:5
                  turnips:8
                  potatoes:15
                  lettuce:15
                  green beans:32
                  yams:104
                  carrots:104

              The lines are sorted in numeric order; when two lines have the
              same number, they appear in reverse collating order.

       6.  To replace the original file with the sorted text, enter:

                  sort -o vegetables vegetables

              The -o vegetables option stores the sorted output into the file
              vegetables.

       7.  To collate using Spanish rules, set the LC_COLLATE (or LANG)
           environment variable to a Spanish locale, and then use sort in the
           regular way.  For example, enter:

                  sort sp.words

              If an input file named sp.words contains the following Spanish
              words:

                  dama
                  loro
                  chapa
                  canto
                  mover
                  chocolate
                  curioso
                  llanura

              The sorted file looks like this:

                  canto
                  curioso
                  chapa
                  chocolate
                  dama
                  loro
                  llanura
                  mover

              If you sort the file in the default C locale, the output looks
              like this:

                  canto
                  chapa
                  chocolate
                  curioso
                  dama
                  llanura
                  loro
                  mover

       The following environment variables affect the execution of sort:

       LANG
              Provides a default value for the internationalization variables
              that are unset or null.  If LANG is unset or null, the
              corresponding value from the default locale is used.  If any of
              the internationalization variables contain an invalid setting,
              the utility behaves as if none of the variables had been
              defined.

       LC_ALL
              If set to a non-empty string value, overrides the values of all
              the other internationalization variables.

       LC_CTYPE
              Determines the locale for the interpretation of sequences of
              bytes of text data as characters (for example, single-byte as
              opposed to multibyte characters in arguments) and the behavior
              of character classification for the -b, -d, -f, -i, and -n
              options.

       LC_MESSAGES
              Determines the locale for the format and contents of diagnostic
              messages written to standard error.

       NLSPATH
              Determines the location of message catalogues for the processing
              of LC_MESSAGES.
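       One practical consequence of the override rule is that setting LC_ALL
       for a single command forces a known collation regardless of the other
       variables.  The sketch below assumes nothing about which locales are
       installed: the es_ES value is only an illustrative name, and it is
       never consulted because LC_ALL takes precedence.

```shell
# LC_COLLATE names a Spanish locale purely as an illustration; because
# LC_ALL is also set, LC_ALL wins and the C (byte-order) collation is used.
forced_c=$(printf 'b\nB\na\n' | LC_COLLATE=es_ES LC_ALL=C sort)

printf '%s\n' "$forced_c"
```

       In the C locale the uppercase B sorts before both lowercase letters.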

       Configuration files

       Commands:  comm(1), join(1), uniq(1)

       Functions:  setlocale(3), tolower(3)

       Files:  locale(4)

       Standards:  standards(5)



Copyright (c) for man pages and the logo by the respective OS vendor.
