nawk man page on SunOS

nawk(1)				 User Commands			       nawk(1)

NAME
       nawk - pattern scanning and processing language

SYNOPSIS
       /usr/bin/nawk   [-F ERE]	 [-v assignment]  'program'  |	-f progfile...
       [argument...]

       /usr/xpg4/bin/awk  [-F ERE]  [-v assignment...]	'program'  |  -f prog‐
       file... [argument...]

DESCRIPTION
       The  /usr/bin/nawk  and	/usr/xpg4/bin/awk  utilities  execute programs
       written in the nawk programming language, which is specialized for tex‐
       tual  data  manipulation.  A nawk program is a sequence of patterns and
       corresponding actions. The string specifying program must  be  enclosed
       in  single  quotes  (') to protect it from interpretation by the shell.
       The sequence of pattern - action statements can	be  specified  in  the
       command line as program or in one, or more, file(s) specified by the -f
       progfile option. When input is read that matches a pattern, the	action
       associated with the pattern is performed.

       Input  is interpreted as a sequence of records. By default, a record is
       a line, but this can be changed by using the RS built-in variable. Each
       record  of  input  is  matched to each pattern in the program. For each
       pattern matched, the associated action is executed.

       The nawk utility interprets each input record as a sequence  of	fields
       where,  by  default,  a field is a string of non-blank characters. This
       default white-space field delimiter (blanks and/or tabs) can be changed
       by  using the FS built-in variable or the -F ERE option. The nawk util‐
       ity denotes the first field in a record	$1,  the  second  $2,  and  so
       forth.  The  symbol  $0	refers to the entire record; setting any other
       field causes the reevaluation of $0. Assigning to $0 resets the	values
       of all fields and the NF built-in variable.
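
       As a minimal sketch of the field handling described above, the
       commands below assume that a POSIX-conforming awk is reachable on the
       PATH as awk (on SunOS this would be /usr/bin/nawk or
       /usr/xpg4/bin/awk). The sample input and the shell variable names are
       invented for illustration.

```shell
# Default splitting: fields are runs of non-blank characters.
second=$(printf 'alpha   beta\tgamma\n' | awk '{ print $2 }')
echo "$second"

# -F sets the field separator; here a colon, as in passwd-style data.
first=$(printf 'root:x:0:0\n' | awk -F: '{ print $1 }')
echo "$first"

# Assigning to $0 re-splits the record and resets NF.
count=$(printf 'one two\n' | awk '{ $0 = "a b c d"; print NF }')
echo "$count"
```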

OPTIONS
       The following options are supported:

       -F ERE	       Define  the  input  field  separator to be the extended
		       regular expression ERE, before any input is  read  (can
		       be a character).

       -f progfile     Specifies  the pathname of the file progfile containing
		       a nawk program. If multiple instances  of  this	option
		       are specified, the concatenation of the files specified
		       as progfile in the order specified is the nawk program.
		       The  nawk program can alternatively be specified in the
		       command line as a single argument.

       -v assignment   The assignment argument must be in the same form as  an
		       assignment  operand.  The  assignment  is  of  the form
		       var=value, where var is the name of one	of  the	 vari‐
		       ables  described below. The specified assignment occurs
		       before  executing  the  nawk  program,  including   the
		       actions associated with BEGIN patterns (if any). Multi‐
		       ple occurrences of this option can be specified.
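
       The -v option can be sketched as follows, assuming a POSIX-conforming
       awk on the PATH; the variable names name, a, and b are hypothetical.
       Because the assignment happens before the program starts, the values
       are already visible inside BEGIN.

```shell
# A -v assignment is in effect even inside a BEGIN action.
greeting=$(awk -v name=world 'BEGIN { print "hello, " name }')
echo "$greeting"

# The option can be repeated; numeric strings take their numeric value.
total=$(awk -v a=1 -v b=2 'BEGIN { print a + b }')
echo "$total"
```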

OPERANDS
       The following operands are supported:

       program	       If no -f option is specified, the first operand to nawk
		       is  the	text of the nawk program. The application sup‐
		       plies the program operand as a single argument to nawk.
		       If  the	text does not end in a newline character, nawk
		       interprets the text as if it did.

       argument	       Either of the following two types of  argument  can  be
		       intermixed:

		       file

			   A  pathname of a file that contains the input to be
			   read, which is matched against the set of  patterns
			   in  the program. If no file operands are specified,
			   or if a file operand is −, the  standard  input  is
			   used.

		       assignment

			   An operand that begins with an underscore or alpha‐
			   betic character from the  portable  character  set,
			   followed  by	 a sequence of underscores, digits and
			   alphabetics from the portable character  set,  fol‐
			   lowed  by  the  =  character	 specifies  a variable
			   assignment rather than a pathname.  The  characters
			   before the = represent the name of a nawk variable.
			   If that name is a nawk reserved word, the  behavior
                           is undefined. The characters following the equal
                           sign are interpreted as if they appeared in the nawk
                           program preceded and followed by a double-quote (")
                           character, as a STRING token, except that if the
                           last character is an unescaped backslash, it is
                           interpreted as a literal backslash rather than as
                           the first character of the sequence "\"". The
                           variable is assigned the value of that STRING token.
                           If the value is considered a numeric string, the
                           variable is assigned its numeric value. Each such
                           variable assignment is performed just before the
                           processing of the following file, if any. Thus, an
			   assignment  before  the first file argument is exe‐
			   cuted after the BEGIN actions (if  any),  while  an
			   assignment after the last file argument is executed
			   before the END actions (if any).  If there  are  no
			   file	 arguments,  assignments  are  executed before
			   processing the standard input.
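
       The timing of assignment operands can be sketched as below, assuming a
       POSIX-conforming awk on the PATH; the variable v and the one-line
       input are invented. With no file operands, the assignment takes effect
       after the BEGIN actions and before standard input is processed.

```shell
# v is still unset in BEGIN, but set for the records and for END.
trace=$(printf 'rec\n' | awk '
    BEGIN { printf "begin=[%s] ", v }
          { printf "main=[%s] ", v }
    END   { printf "end=[%s]\n", v }
' v=7)
echo "$trace"
```

       Compare this with -v, whose assignment is already in effect in BEGIN.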

INPUT FILES
       Input files to the nawk program from any of the following sources:

	 ·  any file operands or their equivalents, achieved by modifying  the
	    nawk variables ARGV and ARGC

	 ·  standard input in the absence of any file operands

	 ·  arguments to the getline function

       must  be	 text  files.  Whether the variable RS is set to a value other
       than a newline character or not, for these files, implementations  sup‐
       port  records  terminated with the specified separator up to {LINE_MAX}
       bytes and may support longer records.

       If -f progfile is specified, the files named by each  of	 the  progfile
       option-arguments must be text files containing an nawk program.

       The standard input is used only if no file operands are specified, or
       if a file operand is −.

EXTENDED DESCRIPTION
       A nawk program is composed of pairs of the form:

       pattern { action }

       Either the pattern or the action (including the enclosing brace charac‐
       ters)  can  be  omitted.	 Pattern-action	 statements are separated by a
       semicolon or by a newline.

       A missing pattern matches any record of input, and a missing action  is
       equivalent  to  an  action  that	 writes the matched record of input to
       standard output.
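
       A short sketch of the two omissions, assuming a POSIX-conforming awk
       on the PATH and invented input lines: the first program has a pattern
       but no action, the second has an action but no pattern.

```shell
# Pattern only: each matching record is written to standard output.
matched=$(printf 'apple\nbanana\ncherry\n' | awk '/an/')
echo "$matched"

# Action only: the action runs for every record.
count=$(printf 'apple\nbanana\ncherry\n' | awk '{ n++ } END { print n }')
echo "$count"
```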

       Execution of the nawk program starts by	first  executing  the  actions
       associated  with all BEGIN patterns in the order they occur in the pro‐
       gram. Then each file operand (or standard input if no files were speci‐
       fied) is processed by reading data from the file until a record separa‐
       tor is seen (a newline character by  default),  splitting  the  current
       record  into fields using the current value of FS, evaluating each pat‐
       tern in the program in the  order  of  occurrence,  and	executing  the
       action  associated  with	 each pattern that matches the current record.
       The action for a matching pattern is executed before evaluating
       subsequent patterns. Last, the actions associated with all END
       patterns are executed in the order they occur in the program.
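
       The execution order can be illustrated with the following sketch,
       which assumes a POSIX-conforming awk on the PATH; the labels printed
       are arbitrary. The END pattern is deliberately written first to show
       that textual position does not move it ahead of BEGIN or the records.

```shell
order=$(printf 'x\ny\n' | awk '
    END   { print "end" }
    BEGIN { print "b1" }
          { print "rec:" $0 }
    BEGIN { print "b2" }
')
echo "$order"
```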

   Expressions in nawk
       Expressions describe computations used in patterns and actions. In  the
       following  table,  valid expression operations are given in groups from
       highest precedence first to lowest precedence last,  with  equal-prece‐
       dence operators grouped between horizontal lines. In expression evalua‐
       tion, where the grammar is formally ambiguous, higher precedence opera‐
       tors  are  evaluated  before lower precedence operators.	 In this table
       expr, expr1, expr2, and expr3 represent any  expression,	 while	lvalue
       represents  any	entity	that  can be assigned to (that is, on the left
       side of an assignment operator).

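       Syntax             Name                      Type of Result    Associativity
       ----------------------------------------------------------------------------
       ( expr )           Grouping                  type of expr      n/a
       ----------------------------------------------------------------------------
       $expr              Field reference           string            n/a
       ----------------------------------------------------------------------------
       ++ lvalue          Pre-increment             numeric           n/a
       -- lvalue          Pre-decrement             numeric           n/a
       lvalue ++          Post-increment            numeric           n/a
       lvalue --          Post-decrement            numeric           n/a
       ----------------------------------------------------------------------------
       expr ^ expr        Exponentiation            numeric           right
       ----------------------------------------------------------------------------
       ! expr             Logical not               numeric           n/a
       + expr             Unary plus                numeric           n/a
       - expr             Unary minus               numeric           n/a
       ----------------------------------------------------------------------------
       expr * expr        Multiplication            numeric           left
       expr / expr        Division                  numeric           left
       expr % expr        Modulus                   numeric           left
       ----------------------------------------------------------------------------
       expr + expr        Addition                  numeric           left
       expr - expr        Subtraction               numeric           left
       ----------------------------------------------------------------------------
       expr expr          String concatenation      string            left
       ----------------------------------------------------------------------------
       expr < expr        Less than                 numeric           none
       expr <= expr       Less than or equal to     numeric           none
       expr != expr       Not equal to              numeric           none
       expr == expr       Equal to                  numeric           none
       expr > expr        Greater than              numeric           none
       expr >= expr       Greater than or equal to  numeric           none
       ----------------------------------------------------------------------------
       expr ~ expr        ERE match                 numeric           none
       expr !~ expr       ERE non-match             numeric           none
       ----------------------------------------------------------------------------
       expr in array      Array membership          numeric           left
       ( index ) in array Multi-dimension array     numeric           left
                            membership
       ----------------------------------------------------------------------------
       expr && expr       Logical AND               numeric           left
       ----------------------------------------------------------------------------
       expr || expr       Logical OR                numeric           left
       ----------------------------------------------------------------------------
       expr1 ? expr2      Conditional expression    type of selected  right
         : expr3                                      expr2 or expr3
       ----------------------------------------------------------------------------
       lvalue ^= expr     Exponentiation assignment numeric           right
       lvalue %= expr     Modulus assignment        numeric           right
       lvalue *= expr     Multiplication assignment numeric           right
       lvalue /= expr     Division assignment       numeric           right
       lvalue += expr     Addition assignment       numeric           right
       lvalue -= expr     Subtraction assignment    numeric           right
       lvalue = expr      Assignment                type of expr      right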

       Each expression has either a string value, a  numeric  value  or	 both.
       Except  as  stated for specific contexts, the value of an expression is
       implicitly converted to the type needed for the context in which it  is
       used.  A string value is converted to a numeric value by the equivalent
       of the following calls:

       setlocale(LC_NUMERIC, "");
       numeric_value = atof(string_value);

       A numeric value that is exactly equal to the value  of  an  integer  is
       converted  to a string by the equivalent of a call to the sprintf func‐
       tion with the string %d as the fmt argument and the numeric value being
       converted as the first and only expr argument.  Any other numeric value
       is converted to a string by the equivalent of a	call  to  the  sprintf
       function with the value of the variable CONVFMT as the fmt argument and
       the numeric value being converted as the first and only expr argument.
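
       The two number-to-string rules can be sketched as below. This assumes
       an awk on the PATH that supports CONVFMT (per this page,
       /usr/xpg4/bin/awk does); the variable names are hypothetical.
       Concatenating with the empty string forces the conversion.

```shell
converted=$(awk 'BEGIN {
    n = 10 / 2          # exactly integral, so it converts as with "%d"
    CONVFMT = "%.2f"    # applies to non-integral values outside print
    r = 1 / 3
    s = n ""            # concatenation forces number-to-string conversion
    t = r ""
    print s " " t
}')
echo "$converted"
```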

       A string value is considered to be a numeric string  in	the  following
       case:

       1.  Any leading and trailing blank characters are ignored.

       2.  If the first unignored character is a + or −, it is ignored.

       3.  If the remaining unignored characters would be lexically recognized
	   as a NUMBER token, the string is considered a numeric string.

       If a − character is ignored in the above steps, the  numeric  value  of
       the  numeric  string is the negation of the numeric value of the recog‐
       nized NUMBER token. Otherwise the numeric value of the  numeric	string
       is  the	numeric value of the recognized NUMBER token. Whether or not a
       string is a numeric string is relevant only in contexts where that term
       is used in this section.

       When  an	 expression  is used in a Boolean context, if it has a numeric
       value, a value of zero is treated as  false  and	 any  other  value  is
       treated	as  true.  Otherwise,  a  string  value	 of the null string is
       treated as false and any other value is treated as true. A Boolean con‐
       text is one of the following:

	 ·  the first subexpression of a conditional expression.

	 ·  an	expression operated on by logical NOT, logical AND, or logical
	    OR.

	 ·  the second expression of a for statement.

	 ·  the expression of an if statement.

	 ·  the expression of the while clause in either a  while  or  do  ...
	    while statement.

	 ·  an expression used as a pattern (as in Overall Program Structure).

       The  nawk language supplies arrays that are used for storing numbers or
       strings. Arrays need not be declared. They  are	initially  empty,  and
       their sizes change dynamically. The subscripts, or element identifiers,
       are strings, providing a type of associative array capability.
       An  array  name	followed  by a subscript within square brackets can be
       used as an lvalue and as an expression, as described  in	 the  grammar.
       Unsubscripted array names are used in only the following contexts:

	 ·  a parameter in a function definition or function call.

	 ·  the NAME token following any use of the keyword in.

       A  valid	 array	index  consists of one or more comma-separated expres‐
       sions, similar to the way in which multi-dimensional arrays are indexed
       in  some	 programming  languages.  Because  nawk arrays are really one-
       dimensional, such a comma-separated  list  is  converted	 to  a	single
       string  by concatenating the string values of the separate expressions,
       each separated from the other by the value of the SUBSEP variable.

       Thus, the following two index operations are equivalent:

       var[expr1, expr2, ... exprn]
       var[expr1 SUBSEP expr2 SUBSEP ... SUBSEP exprn]

       A multi-dimensioned index used with the in  operator  must  be  put  in
       parentheses.  The  in operator, which tests for the existence of a par‐
       ticular array element, does not create  the  element  if	 it  does  not
       exist.	Any  other reference to a non-existent array element automati‐
       cally creates it.
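
       A brief sketch of these array rules, assuming a POSIX-conforming awk
       on the PATH; the array name grid and its keys are made up. It checks
       that the comma form and the explicit SUBSEP form name the same
       element, that in does not create an element, and that a plain
       reference does.

```shell
result=$(awk 'BEGIN {
    grid[1, 2] = "x"                      # key is "1" SUBSEP "2"
    if ((1, 2) in grid) printf "pair "
    if ((1 SUBSEP 2) in grid) printf "joined "
    if (!("missing" in grid)) printf "absent "
    n = 0; for (k in grid) n++            # "in" created nothing: still 1
    dummy = grid["ghost"]                 # a plain reference creates it
    m = 0; for (k in grid) m++
    print n, m
}')
echo "$result"
```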

   Variables and Special Variables
       Variables can be used in an nawk program by referencing them. With  the
       exception  of  function	parameters,  they are not explicitly declared.
       Uninitialized scalar variables and array elements have both  a  numeric
       value of zero and a string value of the empty string.

       Field variables are designated by a $ followed by a number or numerical
       expression. The effect of the field  number  expression	evaluating  to
       anything	 other	than a non-negative integer is unspecified. Uninitial‐
       ized variables or string values need not be converted to numeric values
       in  this	 context. New field variables are created by assigning a value
       to them. References to non-existent fields (that is, fields after  $NF)
       produce	the  null  string.  However, assigning to a non-existent field
       (for example, $(NF+2) = 5) increases the value of NF, creates any
       intervening fields with the null string as their values, and causes
       the value of $0 to be recomputed, with the fields being separated by
       the value of
       OFS.  Each  field  variable  has	 a  string  value when created. If the
       string, with any occurrence of the  decimal-point  character  from  the
       current	locale	changed to a period character, is considered a numeric
       string (see Expressions in nawk above), the field variable also has the
       numeric value of the numeric string.
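
       The effect of assigning past $NF can be sketched as follows, assuming
       a POSIX-conforming awk on the PATH and an invented two-field record.
       OFS is set to a hyphen so that the recomputed $0 makes the empty
       intervening field visible.

```shell
fields=$(printf 'a b\n' | awk '
    BEGIN { OFS = "-" }
    {
        $(NF + 2) = 5        # NF goes from 2 to 4; $3 is created empty
        print NF
        print $0             # rebuilt from the fields, joined by OFS
        print "[" $9 "]"     # reading a non-existent field yields ""
    }
')
echo "$fields"
```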

   /usr/bin/nawk, /usr/xpg4/bin/awk
       nawk  sets  the	following special variables that are supported by both
       /usr/bin/nawk and /usr/xpg4/bin/awk:

       ARGC	       The number of elements in the ARGV array.

       ARGV	       An array of command line arguments,  excluding  options
		       and the program argument, numbered from zero to ARGC−1.

		       The arguments in ARGV can be modified or added to; ARGC
		       can be altered.	As each input file ends,  nawk	treats
		       the  next  non-null  element of ARGV, up to the current
		       value of ARGC−1, inclusive, as the  name	 of  the  next
		       input  file.   Setting an element of ARGV to null means
		       that it is not treated as an input  file.  The  name  −
		       indicates  the  standard	 input. If an argument matches
		       the format of an assignment operand, this  argument  is
		       treated as an assignment rather than a file argument.

       ENVIRON	       The variable ENVIRON is an array representing the value
		       of the  environment.  The  indices  of  the  array  are
		       strings	consisting  of	the  names  of the environment
		       variables, and the value of each	 array	element	 is  a
		       string consisting of the value of that variable. If the
		       value  of  an  environment  variable  is	 considered  a
		       numeric	string, the array element also has its numeric
		       value.

		       In all cases where nawk behavior is affected  by	 envi‐
		       ronment	variables  (including  the  environment of any
		       commands that nawk executes via the system function  or
		       via pipeline redirections with the print statement, the
		       printf statement, or the getline function),  the	 envi‐
		       ronment	used is the environment at the time nawk began
		       executing.

       FILENAME	       A pathname of the current input file.  Inside  a	 BEGIN
		       action the value is undefined. Inside an END action the
		       value is the name of the last input file processed.

       FNR	       The ordinal number of the current record in the current
		       file.  Inside  a BEGIN action the value is zero. Inside
		       an END action the value	is  the	 number	 of  the  last
		       record processed in the last file processed.

       FS	       Input field separator regular expression; a space char‐
		       acter by default.

       NF	       The number of fields in the current  record.  Inside  a
		       BEGIN  action, the use of NF is undefined unless a get‐
		       line function without a var argument is executed previ‐
		       ously.  Inside  an  END action, NF retains the value it
		       had for the last	 record	 read,	unless	a  subsequent,
		       redirected,  getline function without a var argument is
		       performed prior to entering the END action.

       NR	       The ordinal number of the current record from the start
		       of  input.  Inside  a  BEGIN  action the value is zero.
		       Inside an END action the value is  the  number  of  the
		       last record processed.

       OFMT            The printf format for converting numbers to strings in
                       output statements; "%.6g" by default. The result of the
		       conversion is unspecified if the value of OFMT is not a
		       floating-point format specification.

       OFS	       The print statement output  field  separator;  a	 space
		       character by default.

       ORS	       The  print output record separator; a newline character
		       by default.

       RLENGTH         The length of the string matched by the match function.

       RS	       The first character of the string value of  RS  is  the
		       input record separator; a newline character by default.
		       If RS contains more than one character, the results are
		       unspecified.  If RS is null, then records are separated
		       by sequences of one or more  blank  lines.  Leading  or
		       trailing	 blank	lines  do not produce empty records at
		       the beginning or end of input, and the field  separator
		       is always newline, no matter what the value of FS.

       RSTART	       The  starting  position	of  the	 string matched by the
		       match function, numbering from 1. This is always equiv‐
		       alent to the return value of the match function.

       SUBSEP	       The  subscript  separator  string for multi-dimensional
		       arrays. The default value is \034.
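
       Two of the variables above can be exercised with the sketch below,
       which assumes a POSIX-conforming awk on the PATH; the input text and
       the operands x and y=1 are invented. The first command uses a null RS
       so that blank lines separate records and newlines separate fields;
       the second lists ARGV from inside BEGIN, before any operand is
       treated as a file or an assignment.

```shell
# Null RS: two records; the first has three fields (a, b, c).
summary=$(printf 'a b\nc\n\n\nd\n' | awk '
    BEGIN { RS = "" }
          { printf "%d:%d ", NR, NF }
    END   { print NR }
')
echo "$summary"

# ARGV[1] .. ARGV[ARGC-1] hold the operands; a BEGIN-only program reads no input.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]; print ARGC }' x y=1)
echo "$args"
```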

   /usr/xpg4/bin/awk
       The following variable is supported for /usr/xpg4/bin/awk only:

       CONVFMT	       The printf format for  converting  numbers  to  strings
		       (except for output statements, where OFMT is used). The
		       default is %.6g.

   Regular Expressions
       The nawk utility makes use of the extended regular expression  notation
       (see  regex(5)) except that it allows the use of C-language conventions
       to escape special characters within the EREs, namely \\,	 \a,  \b,  \f,
       \n,  \r,	 \t,  \v,  and	those specified in the following table.	 These
       escape sequences are recognized both inside and outside bracket expres‐
       sions.	Note  that records need not be separated by newline characters
       and string constants can contain newline characters,  so	 even  the  \n
       sequence	 is  valid  in	nawk EREs.  Using a slash character within the
       regular expression requires escaping as shown in the table below:

       Escape Sequence	       Description		     Meaning
	     \"		 Backslash quotation-mark   Quotation-mark character
	     \/		 Backslash slash	    Slash character
	    \ddd	 A  backslash	character   The character encoded by
			 followed  by the longest   the one-, two- or three-
			 sequence of one, two, or   digit   octal   integer.
			 three	octal-digit char‐   Multi-byte	  characters
			 acters	 (01234567).   If   require  multiple,	con‐
			 all of the digits are 0,   catenated	      escape
			 (that is, representation   sequences, including the
			 of  the NULL character),   leading \ for each byte.
			 the  behavior	is  unde‐
			 fined.
	     \c		 A  backslash	character   Undefined
			 followed  by any charac‐
			 ter  not  described   in
			 this  table  or  special
			 characters (\\, \a,  \b,
			 \f, \n, \r, \t, \v).

       A  regular expression can be matched against a specific field or string
       by using one of the two regular expression matching  operators,	~  and
       !~.  These  operators  interpret	 their right-hand operand as a regular
       expression and their left-hand operand as  a  string.  If  the  regular
       expression  matches the string, the ~ expression evaluates to the value
       1, and the !~ expression evaluates to  the  value  0.  If  the  regular
       expression does not match the string, the ~ expression evaluates to the
       value 0, and the !~ expression evaluates to the value 1. If the	right-
       hand  operand  is  any expression other than the lexical token ERE, the
       string value of the expression is interpreted as	 an  extended  regular
       expression, including the escape conventions described above. Notice
       that these same escape conventions are also applied in determining
       the value of a string literal (the lexical token STRING), and are
       applied a second time when a string literal is used in this context.
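
       The double application of escapes can be sketched as below, assuming
       a POSIX-conforming awk on the PATH; the test strings are invented. An
       ERE token needs one backslash to match a literal period, while a
       string used as the right-hand operand needs two.

```shell
matches=$(awk 'BEGIN {
    print ("a.b" ~ /a\.b/)      # ERE token: \. is a literal period
    print ("axb" ~ /a\.b/)      # so "x" does not match
    print ("a.b" ~ "a\\.b")     # string: \\ yields \, then \. is the ERE
    print ("axb" !~ "a\\.b")    # non-match evaluates to 1
}')
echo "$matches"
```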

       When an ERE token appears as an expression in any context other than as
       the  right-hand of the ~ or !~ operator or as one of the built-in func‐
       tion arguments described below, the value of the	 resulting  expression
       is the equivalent of:

       $0 ~ /ere/

       The ere argument to the gsub, match, and sub functions, and the fs
       argument to the split function (see String Functions), are interpreted
       as extended regular expressions. These can be either ERE tokens or
       arbitrary expressions, and are interpreted in the same manner as the
       right-hand side of the ~ or !~ operator.
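
       As an illustration, assuming a POSIX-conforming awk on the PATH and an
       invented sample string, the sketch below passes an ERE token to split
       and gsub and a string expression to match, then reports the RSTART
       and RLENGTH values that match sets.

```shell
report=$(awk 'BEGIN {
    s = "foo  bar baz"
    n = split(s, parts, / +/)      # fs given as an ERE token
    where = match(s, "ba[rz]")     # ere given as a string expression
    printf "%d %s %d %d %d\n", n, parts[2], where, RSTART, RLENGTH
    gsub(/a/, "A", s)              # replace every match in s
    print s
}')
echo "$report"
```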

       An  extended regular expression can be used to separate fields by using
       the -F ERE option or by assigning a string containing the expression to
       the  built-in  variable	FS.  The default value of the FS variable is a
       single space character. The following describes FS behavior:

       1.  If FS is a single character:

	     ·	If FS is the space character, skip leading and trailing	 blank
		characters;  fields are delimited by sets of one or more blank
		characters.

	     ·	Otherwise, if FS is any other character c, fields  are	delim‐
		ited by each single occurrence of c.

       2.  Otherwise,  the  string value of FS is considered to be an extended
	   regular expression. Each occurrence	of  a  sequence	 matching  the
	   extended regular expression delimits fields.
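
       The three FS cases can be sketched as follows, assuming a
       POSIX-conforming awk on the PATH; the input lines are invented.

```shell
# FS is a space: leading and trailing blanks are skipped, runs collapse.
blank=$(printf '  a   b  \n' | awk '{ print NF }')
echo "$blank"

# FS is another single character: every occurrence delimits a field,
# so the two adjacent commas enclose an empty field.
single=$(printf 'a,,b\n' | awk -F, '{ print NF }')
echo "$single"

# FS is longer than one character: it is treated as an ERE.
ere=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print NF ":" $3 }')
echo "$ere"
```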

       Except  in  the gsub, match, split, and sub built-in functions, regular
       expression matching is based on input records. That is, record  separa‐
       tor  characters (the first character of the value of the variable RS, a
       newline character by default) cannot be embedded in the expression, and
       no  expression  matches	the  record separator character. If the record
       separator is not a newline character, newline  characters  embedded  in
       the expression can be matched. In those four built-in functions, regular
       expression matching is based on text strings. So, any character
       (including  the	newline	 character  and	 the  record separator) can be
       embedded in the pattern and an appropriate pattern will match any char‐
       acter. However, in all nawk regular expression matching, the use of one
       or more NUL characters in the pattern, input record or text string pro‐
       duces undefined results.

   Patterns
       A pattern is any valid expression, a range specified by two expressions
       separated by comma, or one of the two special patterns BEGIN or END.

   Special Patterns
       The nawk utility recognizes two special patterns, BEGIN and  END.  Each
       BEGIN pattern is matched once and its associated action executed before
       the first record of input is read (except possibly by use of  the  get‐
       line  function in a prior BEGIN action) and before command line assign‐
       ment is done. Each END pattern  is  matched  once  and  its  associated
       action executed after the last record of input has been read. These two
       patterns must have associated actions.

       BEGIN and END do not combine with other patterns.  Multiple  BEGIN  and
       END  patterns  are  allowed. The actions associated with the BEGIN pat‐
       terns are executed in the order specified in the program,  as  are  the
       END actions. An END pattern can precede a BEGIN pattern in a program.

       If an nawk program consists of only actions with the pattern BEGIN, and
       the BEGIN action contains no getline function, nawk exits without read‐
       ing  its input when the last statement in the last BEGIN action is exe‐
       cuted. If an nawk program consists of only actions with the pattern END
       or  only	 actions  with	the  patterns BEGIN and END, the input is read
       before the statements in the END actions are executed.

   Expression Patterns
       An expression pattern is evaluated as if it were	 an  expression	 in  a
       Boolean	context.  If  the result is true, the pattern is considered to
       match, and the associated action (if any) is executed. If the result is
       false, the action is not executed.

   Pattern Ranges
       A  pattern  range  consists of two expressions separated by a comma. In
       this case, the action is performed for all records between a  match  of
       the  first expression and the following match of the second expression,
       inclusive. At this point, the pattern range can be repeated starting at
       input records subsequent to the end of the matched range.
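
       A pattern range can be sketched as below, assuming a POSIX-conforming
       awk on the PATH; the markers start and end are arbitrary. The range
       is entered twice, and the second time it runs to the end of input
       because no closing match follows.

```shell
ranged=$(printf '1\nstart\n2\nend\n3\nstart\n4\n' | awk '/start/,/end/')
echo "$ranged"
```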

   Actions
       An  action  is  a sequence of statements. A statement may be one of the
       following:

       if ( expression ) statement [ else statement ]
       while ( expression ) statement
       do statement while ( expression )
       for ( expression ; expression ; expression ) statement
       for ( var in array ) statement
       delete array[subscript] #delete an array element
       break
       continue
       { [ statement ] ... }
       expression	 # commonly variable = expression
       print [ expression-list ] [ >expression ]
       printf format [ ,expression-list ] [ >expression ]
       next		 # skip remaining patterns on this input line
       exit [expr] # skip the rest of the input; exit status is expr
       return [expr]

       Any single statement can be replaced by a statement  list  enclosed  in
       braces.	 The  statements are terminated by newline characters or semi‐
       colons, and are executed sequentially in the order that they appear.

       The next statement causes all further processing of the	current	 input
       record  to  be abandoned. The behavior is undefined if a next statement
       appears or is invoked in a BEGIN or END action.

       The exit statement invokes all END actions in the order in which they
       occur in the program source and then terminates the program without
       reading further input. An exit statement inside an  END	action	termi‐
       nates  the  program  without  further  execution of END actions.	 If an
       expression is specified in an exit statement, its numeric value is  the
       exit status of nawk, unless subsequent errors are encountered or a sub‐
       sequent exit statement with an expression is executed.
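
       The interaction of exit with END can be sketched as follows, assuming
       a POSIX-conforming awk on the PATH; the exit value 3 and the input
       are arbitrary. The END action still runs, the third record is never
       read, and the exit status is the value given to exit.

```shell
status=0
seen=$(printf 'a\nb\nc\n' | awk '
    NR == 2 { exit 3 }
    END     { print "records read: " NR }
') || status=$?
echo "$seen (exit status $status)"
```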

   Output Statements
       Both print and printf statements write to standard output  by  default.
       The  output  is written to the location specified by output_redirection
       if one is supplied, as follows:

       > expression
       >> expression
       | expression

       In all cases, the expression is evaluated to produce a string that is
       used as a full pathname to write into (for > or >>) or as a command to
       be executed (for |). Using the first two forms, if the file of that
       name is not currently open, it is opened, creating it if necessary and,
       using the first form, truncating the file. The output then is appended
       to the file. As long as the file remains open, subsequent calls in
       which expression evaluates to the same string value simply append
       output to the file. The file remains open until the close function is
       called with an expression that evaluates to the same string value.

       The third form writes output onto a stream piped to the input of a com‐
       mand.  The  stream  is  created if no stream is currently open with the
       value of expression as its command name.	 The stream created is equiva‐
       lent  to one created by a call to the popen(3C) function with the value
       of expression as the command argument and a value  of  w	 as  the  mode
       argument. As long as the stream remains open, subsequent calls in
       which expression evaluates to the same string value write output to
       the  existing stream. The stream will remain open until the close func‐
       tion is called with an expression that evaluates	 to  the  same	string
       value.	At  that  time,	 the  stream  is closed as if by a call to the
       pclose function.
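
       The redirection forms can be sketched as below. The example assumes a
       POSIX-conforming awk, plus the sort and mktemp utilities, on the PATH;
       the file and variable names are hypothetical. The first command shows
       that > truncates only when the file is first opened; the second shows
       one pipe stream being reused and then closed.

```shell
# "> f" truncates on first open; the second print appends to the open file.
tmp=$(mktemp)
printf 'old\n' > "$tmp"
awk -v f="$tmp" 'BEGIN { print "one" > f; print "two" > f; close(f) }'
contents=$(cat "$tmp")
rm -f "$tmp"
echo "$contents"

# Every record goes to the same "sort" stream; close() waits for it to finish.
sorted=$(printf 'pear\napple\nfig\n' | awk '
    { print $1 | "sort" }
    END { close("sort"); print "done" }
')
echo "$sorted"
```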

       These output statements take a comma-separated list of expressions,
       referred to in the grammar by the non-terminal symbols expr_list,
       print_expr_list or print_expr_list_opt. This list is referred  to  here
       as the expression list, and each member is referred to as an expression
       argument.

       The print statement writes the value of each expression argument onto
       the indicated output stream, separated by the current output field
       separator (see variable OFS above) and terminated by the output record
       separator (see variable ORS above). All expression arguments are taken
       as strings, being converted if necessary, with the exception that the
       printf format in OFMT is used instead of the value in CONVFMT. An empty
       expression list stands for the whole input record ($0).
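
       As an illustration of OFS, ORS and OFMT in a print statement, the
       following sketch assumes the utility is invoked as awk; the separator
       values are arbitrary choices for the example:

```shell
printed=$(printf 'a b c\n' | awk '
    BEGIN { OFS = "-"; ORS = ";\n"; OFMT = "%.2f" }
    {
        print $1, $2, $3      # arguments joined by OFS, ended by ORS
        print 3.14159, 42     # non-integer number formatted with OFMT
        print                 # empty list: the whole record, $0
    }')
printf '%s\n' "$printed"
```

       The last line is written with its original blanks because no field was
       assigned, so $0 was never rebuilt with OFS.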

       The printf statement produces output based on a notation similar to the
       File Format Notation used to describe file formats in this document.
       Output is produced as specified, with the first expression argument as
       the string format and subsequent expression arguments as the strings
       arg1 to argn, inclusive, with the following exceptions:

       1.  The format is an actual character string rather  than  a  graphical
	   representation.  Therefore, it cannot contain empty character posi‐
	   tions. The space character in the format  string,  in  any  context
	   other  than	a flag of a conversion specification, is treated as an
	   ordinary character that is copied to the output.

       2.  If the character set contains a Delta character and that  character
	   appears  in the format string, it is treated as an ordinary charac‐
	   ter that is copied to the output.

       3.  The escape sequences beginning with a backslash character are
	   treated as sequences of ordinary characters that are copied to the
	   output. Note that these same sequences are interpreted lexically by
	   nawk when they appear in literal strings, but they are not treated
	   specially by the printf statement.

       4.  A field width or precision can be  specified	 as  the  *  character
	   instead  of a digit string. In this case the next argument from the
	   expression list is fetched and its numeric value taken as the field
	   width or precision.

       5.  The	implementation does not precede or follow output from the d or
	   u conversion specifications with blank characters not specified  by
	   the format string.

       6.  The	implementation	does  not precede output from the o conversion
	   specification with  leading	zeros  not  specified  by  the	format
	   string.

       7.  For	the  c conversion specification: if the argument has a numeric
	   value, the character whose encoding is that value  is  output.   If
	   the	value  is  zero or is not the encoding of any character in the
	   character set, the behavior is undefined.  If the argument does not
	   have	 a numeric value, the first character of the string value will
	   be output; if the string does not contain any characters the behav‐
	   ior is undefined.

       8.  For	each  conversion  specification that consumes an argument, the
	   next expression argument will be evaluated. With the	 exception  of
	   the	c  conversion,	the value will be converted to the appropriate
	   type for the conversion specification.

       9.  If there are insufficient expression arguments to satisfy  all  the
	   conversion  specifications  in  the	format string, the behavior is
	   undefined.

       10. If any character sequence in the format  string  begins  with  a  %
	   character,  but does not form a valid conversion specification, the
	   behavior is unspecified.
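
       Some of the rules above can be sketched as follows, assuming the
       utility is invoked as awk and an ASCII-compatible character set (so
       that the value 65 encodes the letter A):

```shell
formatted=$(awk 'BEGIN {
    printf "[%*d]\n", 5, 42          # width 5 taken from the argument list
    printf "[%.*f]\n", 2, 3.14159    # precision 2 taken from the argument list
    printf "[%c%c]\n", 65, "hello"   # numeric 65 gives A; a string gives its first character
    printf "[%-4s|%o]\n", "ab", 8    # left-justified string; octal without a leading zero
}')
printf '%s\n' "$formatted"
```

       Each conversion specification consumes the next expression argument,
       and a * consumes one extra argument ahead of the value it modifies.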

       Both print and printf can output at least {LINE_MAX} bytes.

   Functions
       The nawk language has a	variety	 of  built-in  functions:  arithmetic,
       string, input/output and general.

   Arithmetic Functions
       The  arithmetic functions, except for int, are based on the ISO C stan‐
       dard. The behavior is undefined in cases where the ISO C standard spec‐
       ifies  that  an	error  be  returned or that the behavior is undefined.
       Although the grammar permits built-in functions to appear with no argu‐
       ments  or parentheses, unless the argument or parentheses are indicated
       as optional in the following list (by displaying them within  the  [  ]
       brackets), such use is undefined.

       atan2(y,x)      Return arctangent of y/x.

       cos(x)	       Return cosine of x, where x is in radians.

       sin(x)	       Return sine of x, where x is in radians.

       exp(x)	       Return the exponential function of x.

       log(x)	       Return the natural logarithm of x.

       sqrt(x)	       Return the square root of x.

       int(x)	       Truncate	 its  argument to an integer. It will be trun‐
		       cated toward 0 when x > 0.

       rand()	       Return a random number n, such that 0 ≤ n < 1.

       srand([expr])   Set the seed value for rand to expr or use the time  of
		       day if expr is omitted. The previous seed value will be
		       returned.
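
       A short sketch of the arithmetic functions, assuming the utility is
       invoked as awk. The seed value 1 is arbitrary; the point is only that
       reseeding with the same value restarts the same sequence:

```shell
arith=$(awk 'BEGIN {
    printf "%d %d\n", int(3.9), int(0.5)        # truncation toward zero for x > 0
    printf "%.4f\n", atan2(0, -1)               # pi, as the angle of the point (-1, 0)
    printf "%.1f %.1f\n", sqrt(16), exp(log(8))
    srand(1); a = rand()
    srand(1); b = rand()                        # same seed, same first value
    same = (a == b)
    in_range = (a >= 0 && a < 1)
    print same, in_range
}')
printf '%s\n' "$arith"
```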

   String Functions
       The string functions in the following list shall be supported. Although
       the  grammar  permits built-in functions to appear with no arguments or
       parentheses, unless  the	 argument  or  parentheses  are	 indicated  as
       optional	 in  the  following  list  (by	displaying them within the [ ]
       brackets), such use is undefined.

       gsub(ere,repl[,in])	       Behave like  sub	 (see  below),	except
				       that it will replace all occurrences of
				       the regular  expression	(like  the  ed
				       utility	global substitute) in $0 or in
				       the in argument, when specified.

       index(s,t)		       Return  the  position,  in  characters,
				       numbering  from	1,  in	string s where
				       string t first occurs, or  zero	if  it
				       does not occur at all.

       length[([s])]		       Return  the  length,  in characters, of
				       its argument taken as a string,	or  of
				       the  whole  record,  $0, if there is no
				       argument.

       match(s,ere)		       Return  the  position,  in  characters,
				       numbering from 1, in string s where the
				       extended regular expression ere occurs,
				       or  zero	 if  it does not occur at all.
				       RSTART will  be	set  to	 the  starting
				       position	 (which	 is  the  same	as the
				       returned value), zero if	 no  match  is
				       found;  RLENGTH	will  be  set  to  the
				       length of the matched string, −1 if  no
				       match is found.

       split(s,a[,fs])		       Split  the string s into array elements
				       a[1], a[2], ..., a[n],  and  return  n.
				       The  separation	will  be done with the
				       extended regular expression fs or  with
				       the  field  separator  FS  if fs is not
				       given. Each array element will  have  a
				       string	value  when  created.  If  the
				       string assigned to any  array  element,
				       with  any  occurrence  of  the decimal-
				       point character from the current locale
				       changed to a period character, would be
				       considered a numeric string; the	 array
				       element	will  also  have  the  numeric
				       value of the numeric string. The effect
				       of  a null string as the value of fs is
				       unspecified.

       sprintf(fmt,expr,expr,...)      Format the expressions according to the
				       printf  format  given by fmt and return
				       the resulting string.

       sub(ere,repl[,in])	       Substitute the string repl in place  of
				       the first instance of the extended
				       regular expression ere in string in and
				       return  the number of substitutions. An
				       ampersand ( & ) appearing in the string
				       repl  will  be  replaced	 by the string
				       from  in	 that  matches	 the   regular
				       expression.   For  each	occurrence  of
				       backslash (\) encountered when scanning
				       the  string repl from beginning to end,
				       the next character is  taken  literally
				       and  loses  its	special	 meaning  (for
				       example, \& will be  interpreted	 as  a
				       literal	ampersand  character).	Except
				       for & and \, it is unspecified what the
				       special	meaning	 of any such character
				       is. If in is specified and it is not an
				       lvalue the behavior is undefined. If in
				       is omitted, nawk will substitute in the
				       current record ($0).

       substr(s,m[,n])		       Return  the  at	most  n-character sub‐
				       string of s that begins at position  m,
				       numbering  from 1. If n is missing, the
				       length of the substring will be limited
				       by the length of the string s.

       tolower(s)		       Return  a string based on the string s.
				       Each character in s that is  an	upper-
				       case letter specified to have a tolower
				       mapping by the LC_CTYPE category of the
				       current	locale will be replaced in the
				       returned string by the lower-case  let‐
				       ter  specified  by  the	mapping. Other
				       characters in s will  be	 unchanged  in
				       the returned string.

       toupper(s)		       Return  a string based on the string s.
				       Each character in s that	 is  a	lower-
				       case letter specified to have a toupper
				       mapping by the LC_CTYPE category of the
				       current	locale will be replaced in the
				       returned string by the upper-case  let‐
				       ter  specified  by  the	mapping. Other
				       characters in s will  be	 unchanged  in
				       the returned string.

       All  of	the  preceding functions that take ERE as a parameter expect a
       pattern or a string valued expression that is a regular	expression  as
       defined below.
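
       The string functions can be exercised with a sketch such as the
       following, which assumes the utility is invoked as awk. The sample
       strings are illustrative only:

```shell
strings=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "o"), length(s), substr(s, 7), substr(s, 1, 5)

    match(s, /o+/);     print RSTART, RLENGTH      # position and length of first match
    r = match(s, /z/);  print r, RSTART, RLENGTH   # no match: 0, 0 and -1

    t = s; n = gsub(/o/, "[&]", t); print n, t     # & stands for the matched text
    t = s; n = sub(/o/, "\\&", t);  print n, t     # \& is a literal ampersand

    n = split("a:b:c", parts, ":"); print n, parts[1], parts[3]
    print toupper("MiXed"), tolower("MiXed")
}')
printf '%s\n' "$strings"
```

       The copy into t before each substitution is deliberate: sub and gsub
       modify their third argument in place and return only the count.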

   Input/Output and General Functions
       The input/output and general functions are:

       close(expression)	       Close  the  file	 or  pipe  opened by a
				       print or printf statement or a call  to
				       getline	with  the  same	 string-valued
				       expression. If the close	 was  success‐
				       ful, the function will return 0; other‐
				       wise, it will return non-zero.

       expression|getline[var]	       Read a record of input  from  a	stream
				       piped from the output of a command. The
				       stream will be created if no stream  is
				       currently   open	  with	the  value  of
				       expression as  its  command  name.  The
				       stream  created	will  be equivalent to
				       one created by  a  call	to  the	 popen
				       function	 with  the value of expression
				       as the command argument and a value  of
				       r  as the mode argument. As long as the
				       stream remains open,  subsequent	 calls
				       in  which  expression  evaluates to the
				       same string value will read subsequent
				       records from the stream. The stream will
				       remain open until the close function is
				       called  with  an expression that evalu‐
				       ates to the same string value. At  that
				       time,  the  stream will be closed as if
				       by a call to the	 pclose	 function.  If
				       var  is missing, $0 and NF will be set;
				       otherwise, var will be set.

				       The getline operator can form ambiguous
				       constructs  when	 there	are  operators
				       that are not in parentheses  (including
				       concatenate)  to	 the left of the | (to
				       the beginning of	 the  expression  con‐
				       taining getline). In the context of the
				       $ operator, | behaves as if  it	had  a
				       lower  precedence than $. The result of
				       evaluating other operators is
				       unspecified, and portable applications
				       must parenthesize all such uses
				       properly.

       getline			       Set  $0	to  the next input record from
				       the current input file.	This  form  of
				       getline	will  set  the NF, NR, and FNR
				       variables.

       getline var		       Set variable  var  to  the  next	 input
				       record  from  the  current  input file.
				       This form of getline will set  the  FNR
				       and NR variables.

       getline [var] < expression      Read  the  next	record of input from a
				       named  file.  The  expression  will  be
				       evaluated  to  produce a string that is
				       used as a full pathname. If the file of
				       that  name  is  not  currently open, it
				       will be opened. As long as  the	stream
				       remains open, subsequent calls in which
				       expression evaluates to the same string
				       value will read subsequent records from
				       the file. The  file  will  remain  open
				       until the close function is called with
				       an expression  that  evaluates  to  the
				       same  string  value. If var is missing,
				       $0 and NF will be set;  otherwise,  var
				       will be set.

				       The getline operator can form ambiguous
				       constructs when there are binary opera‐
				       tors   that   are  not  in  parentheses
				       (including concatenate) to the right of
				       the  < (up to the end of the expression
				       containing the getline). The result  of
				       evaluating such a construct is
				       unspecified, and portable applications
				       must parenthesize all such uses
				       properly.

       system(expression)	       Execute the command given by expression
				       in  a  manner  equivalent  to  the sys‐
				       tem(3C) function and  return  the  exit
				       status of the command.

       All  forms  of getline will return 1 for successful input, 0 for end of
       file, and −1 for an error.

       Where strings are used as the name of a file or pipeline,  the  strings
       must  be	 textually  identical.	The  terminology ``same string value''
       implies that ``equivalent strings'', even those	that  differ  only  by
       space characters, represent different files.
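
       The getline forms, close and system might be combined as in this
       sketch. It assumes the utility is invoked as awk; the scratch file and
       the path /nonexistent/nawk-demo are hypothetical names chosen for the
       example:

```shell
tmp=$(mktemp)                       # hypothetical scratch file
printf 'one\ntwo\n' > "$tmp"

io=$(awk -v f="$tmp" 'BEGIN {
    # Read a file record by record; getline returns 1, 0 (end of file) or -1 (error).
    while ((getline line < f) > 0)
        n++
    close(f)
    print "records:", n, "last:", line

    # Read one record from a command; the parentheses avoid ambiguous parsing.
    cmd = "echo from-pipe"
    if ((cmd | getline reply) > 0)
        print "reply:", reply
    close(cmd)

    # A file that cannot be opened yields -1, so the test must be > 0, not != 0.
    rc = (getline junk < "/nonexistent/nawk-demo")
    print "missing:", rc

    # system returns the exit status of the command.
    status = system("exit 3")
    print "status:", status
}')
rm -f "$tmp"
printf '%s\n' "$io"
```

       Testing the result of getline with > 0 rather than as a plain truth
       value matters: a loop written as while (getline line < f) never ends
       when the file cannot be read, because −1 is treated as true.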

   User-defined Functions
       The  nawk language also provides user-defined functions. Such functions
       can be defined as:

       function name(args,...) { statements }

       A function can be referred to anywhere in a nawk program; in
       particular, its use can precede its definition. The scope of a function
       is global.

       Function arguments can be either scalars or  arrays;  the  behavior  is
       undefined  if  an array name is passed as an argument that the function
       uses as a scalar, or if a scalar expression is passed  as  an  argument
       that  the  function uses as an array. Function arguments will be passed
       by value if scalar and by reference if array name. Argument names will
       be local to the function; all other variable names will be global. The
       same name must not be used as both an argument name and as the name of
       a function or a special nawk variable. The same name must not be used
       both as a variable name with global scope and as the name  of  a	 func‐
       tion.  The  same	 name must not be used within the same scope both as a
       scalar variable and as an array.

       The number of parameters in the function definition need not match  the
       number of parameters in the function call. Excess formal parameters can
       be used as local variables. If fewer arguments are supplied in a	 func‐
       tion  call  than	 are  in the function definition, the extra parameters
       that are used in the function body as scalars will be initialized  with
       a  string value of the null string and a numeric value of zero, and the
       extra parameters that are used in the function body as arrays  will  be
       initialized  as empty arrays. If more arguments are supplied in a func‐
       tion call than are in the function definition, the  behavior  is	 unde‐
       fined.
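
       The parameter rules can be sketched as follows, assuming the utility
       is invoked as awk. The function names sum and bump are hypothetical,
       and the extra spaces in the parameter list are only a common
       convention for marking the local variables:

```shell
funcs=$(awk '
    # Excess formal parameters (i, total) act as local variables.
    function sum(arr, n,    i, total) {
        for (i = 1; i <= n; i++)
            total += arr[i]
        return total
    }
    # Arrays are passed by reference, scalars by value.
    function bump(arr, x) {
        arr[1] = 100
        x = 100
    }
    BEGIN {
        i = "global"; v[1] = 1; v[2] = 2; v[3] = 3; x = 5
        print sum(v, 3), i       # the local i did not disturb the global i
        bump(v, x)
        print v[1], x            # array changed, scalar unchanged
    }')
printf '%s\n' "$funcs"
```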

       When  invoking  a  function,  no	 white space can be placed between the
       function name and the opening parenthesis. Function calls can be nested
       and  recursive  calls  can be made upon functions. Upon return from any
       nested or recursive function call, the values of	 all  of  the  calling
       function's  parameters  will  be unchanged, except for array parameters
       passed by reference. The return statement  can  be  used	 to  return  a
       value.  If a return statement appears outside of a function definition,
       the behavior is undefined.

       In the function definition, newline characters are optional before  the
       opening	brace  and  after  the closing brace. Function definitions can
       appear anywhere in the program where a pattern-action pair is allowed.
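
       A recursive function might look like this sketch, which assumes the
       utility is invoked as awk; the name fact is hypothetical:

```shell
fact_out=$(awk '
    function fact(n) {
        # Each call keeps its own copy of n, so the caller is unaffected.
        return (n <= 1) ? 1 : n * fact(n - 1)
    }
    BEGIN { print fact(5), fact(10) }')
printf '%s\n' "$fact_out"
```

       The calls are written as fact(5), with the parenthesis directly after
       the name, as required for user-defined functions.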

USAGE
       The index, length, match, and substr functions should not  be  confused
       with  similar  functions	 in the ISO C standard; the nawk versions deal
       with characters, while the ISO C standard deals with bytes.

       Because the concatenation operation is represented by adjacent  expres‐
       sions  rather  than  an explicit operator, it is often necessary to use
       parentheses to enforce the proper evaluation precedence.
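
       The effect of parentheses around a concatenation can be seen in this
       sketch, which assumes the utility is invoked as awk:

```shell
concat=$(awk 'BEGIN {
    a = "A" 1 + 2            # arithmetic binds tighter: "A" (1 + 2)
    b = ("A" 1) + 2          # "A1" has numeric value 0, so the sum is 2
    c = (1 " " 2) == "1 2"   # parentheses make the intended grouping explicit
    print a, b, c
}')
printf '%s\n' "$concat"
```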

       See largefile(5) for the description  of	 the  behavior	of  nawk  when
       encountering files greater than or equal to 2 Gbyte (2**31 bytes).

EXAMPLES
       The nawk program specified in the command line is most easily specified
       within single-quotes (for example, 'program')  for  applications	 using
       sh,  because nawk programs commonly contain characters that are special
       to the shell, including double-quotes. In the cases where a  nawk  pro‐
       gram contains single-quote characters, it is usually easiest to specify
       most of the program as strings within single-quotes concatenated by the
       shell with quoted single-quote characters.  For example:

       nawk '/'\''/ { print "quote:", $0 }'

       prints  all  lines  from	 the  standard input containing a single-quote
       character, prefixed with quote:.

       The following are examples of simple nawk programs:

       Example 1: Write to the standard output all input lines for which field
       3 is greater than 5:

       $3 > 5

       Example 2: Write every tenth line:

       (NR % 10) == 0

       Example 3: Write any line with a substring matching the regular expres‐
       sion:

       /(G|D)(2[0-9][[:alpha:]]*)/

       Example 4: Print any line with a substring containing a G  or  D,  fol‐
       lowed by a sequence of digits and characters:

       This  example uses character classes digit and alpha to match language-
       independent digit and alphabetic characters, respectively.

       /(G|D)([[:digit:][:alpha:]]*)/

       Example 5: Write any line in which the second field matches the regular
       expression and the fourth field does not:

       $2 ~ /xyz/ && $4 !~ /xyz/

       Example	6:  Write  any line in which the second field contains a back‐
       slash:

       $2 ~ /\\/

       Example 7: Write any line in which the second field  contains  a	 back‐
       slash (alternate method):

       Notice  that  backslash	escapes are interpreted twice, once in lexical
       processing of the string and once in processing the regular expression.

       $2 ~ "\\\\"

       Example 8: Write the second to the last and  the	 last  field  in  each
       line, separating the fields by a colon:

       {OFS=":";print $(NF-1), $NF}

       Example 9: Write the line number and number of fields in each line:

       The  three strings representing the line number, the colon and the num‐
       ber of fields are concatenated and that string is written  to  standard
       output.

       {print NR ":" NF}

       Example 10: Write lines longer than 72 characters:

       length($0) > 72

       Example	11:  Write first two fields in opposite order separated by the
       OFS:

       { print $2, $1 }

       Example 12: Same, with input fields separated by comma or space and tab
       characters, or both:

       BEGIN { FS = ",[ \t]*|[ \t]+" }
	     { print $2, $1 }

       Example 13: Add up first column, print sum and average:

	   {s += $1 }
       END {print "sum is ", s, " average is", s/NR}

       Example 14: Write fields in reverse order, one per line (many lines out
       for each line in):

       { for (i = NF; i > 0; --i) print $i }

       Example 15: Write all lines between occurrences of the strings  "start"
       and "stop":

       /start/, /stop/

       Example	16:  Write  all	 lines whose first field is different from the
       previous one:

       $1 != prev { print; prev = $1 }

       Example 17: Simulate the echo command:

       BEGIN  {
	      for (i = 1; i < ARGC; ++i)
		    printf "%s%s", ARGV[i], i==ARGC-1?"\n":" "
	      }

       Example 18: Write the path prefixes contained in the  PATH  environment
       variable, one per line:

       BEGIN  {
	      n = split (ENVIRON["PATH"], path, ":")
	      for (i = 1; i <= n; ++i)
		     print path[i]
	      }

       Example 19: Print the file "input", filling in page numbers starting at
       5:

       If there is a file named input containing page headers of the form

       Page#

       and a file named program that contains

       /Page/{ $2 = n++; }
       { print }

       then the command line

       nawk -f program n=5 input

       will print the file input, filling in page numbers starting at 5.

ENVIRONMENT VARIABLES
       See environ(5) for descriptions of the following environment  variables
       that affect execution: LC_COLLATE, LC_CTYPE, LC_MESSAGES, and NLSPATH.

       LC_NUMERIC      Determine  the  radix  character used when interpreting
		       numeric input, performing conversions  between  numeric
		       and   string  values  and  formatting  numeric  output.
		       Regardless of locale, the period character  (the	 deci‐
		       mal-point  character  of the POSIX locale) is the deci‐
		       mal-point character recognized in processing  awk  pro‐
		       grams  (including  assignments  in  command-line	 argu‐
		       ments).

EXIT STATUS
       The following exit values are returned:

       0	All input files were processed successfully.

       >0	An error occurred.

       The exit status can be altered within the  program  by  using  an  exit
       expression.
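
       A sketch of setting the exit status from within a program, assuming
       the utility is invoked as awk. The variable name failed and the input
       lines are illustrative:

```shell
rc_begin=0
awk 'BEGIN { exit 3 }' || rc_begin=$?

rc_end=0
printf 'ok\nbad\nok\n' |
    awk '$1 == "bad" { failed = 1 } END { exit failed }' || rc_end=$?

printf '%s %s\n' "$rc_begin" "$rc_end"
```

       The second command defers the decision to the END action, so all input
       is still read before the status is reported.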

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

   /usr/bin/nawk
       ┌─────────────────────────────┬─────────────────────────────┐
       │      ATTRIBUTE TYPE	     │	    ATTRIBUTE VALUE	   │
       ├─────────────────────────────┼─────────────────────────────┤
       │Availability		     │SUNWcsu			   │
       └─────────────────────────────┴─────────────────────────────┘

   /usr/xpg4/bin/awk
       ┌─────────────────────────────┬─────────────────────────────┐
       │      ATTRIBUTE TYPE	     │	    ATTRIBUTE VALUE	   │
       ├─────────────────────────────┼─────────────────────────────┤
       │Availability		     │SUNWxcu4			   │
       └─────────────────────────────┴─────────────────────────────┘

SEE ALSO
       awk(1), ed(1), egrep(1), grep(1), lex(1), sed(1), popen(3C),
       printf(3C), system(3C), attributes(5), environ(5), largefile(5),
       regex(5), XPG4(5)

       Aho,  A. V., B. W. Kernighan, and P. J. Weinberger, The AWK Programming
       Language, Addison-Wesley, 1988.

DIAGNOSTICS
       If any file operand is specified and the named file cannot be accessed,
       nawk  will  write  a diagnostic message to standard error and terminate
       without any further action.

       If the program specified by either the program operand  or  a  progfile
       operand	is not a valid nawk program (as specified in EXTENDED DESCRIP‐
       TION), the behavior is undefined.

NOTES
       Input white space is not preserved on output if fields are involved.

       There are no explicit conversions between numbers and strings. To force
       an  expression to be treated as a number add 0 to it; to force it to be
       treated as a string concatenate the null string ("") to it.
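
       The two idioms can be sketched as follows, assuming the utility is
       invoked as awk. The values 10 and 9 are chosen because they order
       differently as strings and as numbers:

```shell
conv=$(awk 'BEGIN {
    x = "10"; y = "9"               # string values
    as_string = (x < y)             # "10" sorts before "9" as a string: 1
    as_number = (x + 0 < y + 0)     # adding 0 forces a numeric comparison: 0
    n = 10; m = 9                   # numeric values
    forced = (n "" < m "")          # concatenating "" forces a string comparison: 1
    print as_string, as_number, forced
}')
printf '%s\n' "$conv"
```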

SunOS 5.10			  17 Jun 2005			       nawk(1)