regexp(3)							     regexp(3)

       advance,	 advance_r, compile, compile_r, step, step_r - Regular expres‐
       sion compile and match routines

       #define INIT declarations
       #define GETC getc code
       #define PEEKC peek code
       #define UNGETC(c) ungetc code
       #define RETURN(ptr) return code
       #define ERROR(val) error code

       #include <regexp.h>

       char *compile(
               char *instring,
               char *expbuf,
               const char *endbuf,
               int eof );

       int step(
               const char *string,
               const char *expbuf );

       int advance(
               const char *string,
               const char *expbuf );

       extern char *loc1, *loc2, *locs;

       The following functions do not conform to current standards and are
       supported only for backward compatibility:

       char *compile_r(
               char *instring,
               char *expbuf,
               char *endbuf,
               int eof,
               struct regexp_data *regexp_data );

       int advance_r(
               char *string,
               char *expbuf,
               struct regexp_data *regexp_data );

       int step_r(
               char *string,
               char *expbuf,
               struct regexp_data *regexp_data );

       Interfaces  documented on this reference page conform to industry stan‐
       dards as follows:

       advance(), compile(), step(): XSH4.2

       Refer to the standards(5) reference page	 for  more  information	 about
       industry standards and associated tags.

       c      The value of the next character (byte) in the regular expression
              pattern. Returned by the next call to the GETC() and PEEKC()
              macros.

       ptr    Specifies a pointer to the character following the last
              character of the compiled regular expression.

       val    Specifies an error value.

       instring
              Specifies a string to be passed to the compile() function.

              The instring parameter is never used explicitly by the compile()
              function, but you can use it in your macros. For example, you
              may want to pass the string containing a pattern as the instring
              parameter to the compile() function and use the INIT() macro to
              set a pointer to the beginning of this string. When your macros
              do not use instring, call the compile() function with a value of
              ((char *) 0) for this parameter.

       expbuf Points to a character array where the compiled regular
              expression is stored.

       endbuf Points to the location that immediately follows the character
              array where the compiled regular expression is stored. When the
              compiled expression cannot be contained in (endbuf - expbuf)
              bytes, a call to the ERROR(_BIGREGEXP) macro is made (see the
              ERRORS section).

       eof    Specifies the character that marks the end of the regular
              expression. For example, in ed this character is usually a /
              (slash).

       string Points to a null-terminated string of characters to be searched
              for a match by the step() and advance() functions.

       regexp_data
              Is data for the compile_r(), step_r(), and advance_r()
              functions.

       The  compile(),	advance(),  and step() functions are used for general-
       purpose expression matching.

       The compile() function takes a simple regular expression as  input  and
       produces	 a  compiled  expression  that can be used with the step() and
       advance() functions.

       The following six macros, used in the compile() function, must be
       defined before the #include <regexp.h> statement in programs. The
       GETC(), PEEKC(), and UNGETC() macros operate on the regular expression
       provided as input for the compile() function.

       INIT   The INIT() macro is used for dependent declarations and
              initializations. In the regexp.h header file this macro is
              located right after the compile() function declarations and
              opening { (left brace). Your INIT() declarations must end with
              a ; (semicolon).

              The INIT() macro is frequently used to set a register variable
              to point to the beginning of the regular expression, so that
              this pointer can be used in declarations for GETC(), PEEKC(),
              and UNGETC(). Alternatively, you can use INIT() to declare
              external variables that GETC(), PEEKC(), and UNGETC() need.

       GETC   The GETC() macro returns the value of the next character (byte)
              in the regular-expression pattern. Successive calls to GETC()
              return successive characters of the regular expression.

       PEEKC  The PEEKC() macro returns the next character (byte) in the
              regular expression. Immediate subsequent calls to this macro
              return the same byte, which is also the next character returned
              by the GETC() macro.

       UNGETC(c)
              The UNGETC() macro causes the c parameter to be returned by the
              next call to the GETC() and PEEKC() macros. No more than one
              character of pushback is ever needed because this character is
              guaranteed to be the last character read by the GETC() macro.
              The value of the UNGETC() macro is always ignored.

       RETURN(ptr)
              The RETURN() macro is used for normal exit of the compile()
              function. The value of the ptr parameter is a pointer to the
              character following the last character of the compiled regular
              expression. This is useful in programs that manage memory
              allocation.

       ERROR(val)
              The ERROR() macro is the abnormal return from the compile()
              function. A call to this macro should never return a value. In
              this macro, val is an error number, which is described in the
              ERRORS section of this reference page.
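       The pushback contract among GETC(), PEEKC(), and UNGETC() can be
       illustrated without the header itself. The following is a minimal
       sketch, not code from regexp.h: the macros are stand-ins written as
       function-like macros over a private pointer (an assumption of this
       sketch), and read_pattern() with its rest parameter is a hypothetical
       helper that copies a pattern up to the eof delimiter.

```c
#include <stddef.h>

/* Stand-ins that follow the contract described above. They are NOT the
 * definitions a real program would pair with <regexp.h>; the function-like
 * form and the variable name sp are assumptions of this sketch. */
#define INIT       const char *sp = instring;
#define GETC()     (*sp++)   /* return next byte and consume it        */
#define PEEKC()    (*sp)     /* return next byte without consuming it  */
#define UNGETC(c)  (--sp)    /* push back the byte GETC() just read    */

/* Hypothetical helper: copy pattern bytes from instring into out until the
 * eof delimiter or the end of the string. On success, returns the number of
 * bytes copied and sets *rest to the first unread byte. Returns -1 if out
 * is too small. */
int read_pattern(const char *instring, int eof, char *out, size_t outsz,
                 const char **rest)
{
    INIT
    size_t n = 0;
    int c;

    if (outsz == 0)
        return -1;
    while (PEEKC() != '\0') {
        c = GETC();
        if (c == eof) {
            UNGETC(c);          /* one byte of pushback is enough */
            break;
        }
        if (n + 1 >= outsz)
            return -1;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    *rest = sp;                 /* points at the delimiter, if one was seen */
    return (int)n;
}
```

       Because the delimiter is pushed back rather than consumed, the caller
       can inspect *rest to see whether the pattern ended at eof or at the
       end of the input.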

       The step() function finds the first substring of the string parameter
       that matches the compiled expression pointed to by the expbuf
       parameter. When there is no match, the step() function returns a value
       of 0 (zero). When there is a match, the step() function returns a
       nonzero value and sets two global character pointers: loc1, which
       points to the first character of the substring that matches the
       pattern, and loc2, which points to the character immediately following
       the substring that matches the pattern. When the regular expression
       matches the entire string, loc1 points to the first character of the
       string parameter and loc2 points to the null character at the end of
       the string.

       The step() function uses the integer variable circf, which  is  set  by
       the  compile()  function	 when  the  regular expression begins with a ^
       (circumflex).  When this variable is  set,  the	step()	function  only
       tries  to  match the regular expression to the beginning of the string.
       When you compile more than one regular expression before executing  the
       first one, save the value of circf for each compiled expression and set
       circf to the saved value before each call to step().

       The advance() function tests whether an initial substring of the string
       parameter matches the expression pointed to by the expbuf parameter.
       The step() function calls the advance() function, using the same
       parameters that were passed to it. The step() function increments a
       pointer through the characters of the string parameter and calls
       advance() until a nonzero value, which indicates a match, is returned,
       or until the end of the string is reached. To constrain the match
       unconditionally to the beginning of the string, call the advance()
       function directly instead of calling the step() function.
       When  the  advance()  function  encounters  an  *  (asterisk) or a \{\}
       sequence in the regular expression, it  advances	 its  pointer  to  the
       string  to  be matched as far as possible and recursively calls itself,
       trying to match the remainder of the regular  expression.  As  long  as
       there  is  no  match,  the advance() function backs up along the string
       until the function finds a match or reaches the	point  in  the	string
       where the initial match with the * or \{\} character occurred.
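       The advance-then-back-up strategy can be sketched as follows. This is
       a toy recursive matcher written for illustration, not the algorithm
       as implemented in regexp.h: it works on pattern text rather than a
       compiled expression, supports only literals, . and a single-character
       *, and all toy_ names are hypothetical.

```c
#include <stddef.h>

const char *toy_end;    /* set to the byte just past a successful match */

int toy_match_here(const char *s, const char *pat);

/* Match zero or more occurrences of c, then the rest of the pattern.
 * First advance as far as possible, then back up one byte at a time. */
int toy_match_star(int c, const char *s, const char *pat)
{
    const char *start = s;

    while (*s != '\0' && (c == '.' || *s == c))
        s++;                        /* greedy: consume every candidate */
    for (;;) {
        if (toy_match_here(s, pat)) /* try the remainder of the pattern */
            return 1;
        if (s == start)             /* back at the point where * began */
            return 0;
        s--;                        /* give back one byte and retry */
    }
}

/* Anchored match of pat against the beginning of s. */
int toy_match_here(const char *s, const char *pat)
{
    if (*pat == '\0') {
        toy_end = s;
        return 1;
    }
    if (pat[1] == '*')
        return toy_match_star(pat[0], s, pat + 2);
    if (*s != '\0' && (*pat == '.' || *pat == *s))
        return toy_match_here(s + 1, pat + 1);
    return 0;
}
```

       For the input aaab and the pattern a*ab, the star first consumes all
       three a characters, fails to match ab at b, and succeeds after giving
       back one a.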

       It is sometimes desirable to stop this backing up before the initial
       pointer position in the string is reached. When the locs global
       character pointer is equal to the pointer position in the string
       during the backing-up process, the advance() function breaks out of
       the recursive loop that backs up and returns the value 0 (zero).

       The  compile_r(), step_r(), and advance_r() functions are the reentrant
       versions of the compile(), step(), and advance()	 functions.  They  are
       supported  in  order  to maintain backward compatibility with operating
       system versions prior to Tru64 UNIX Version 4.0.

       The regexp.h header file defines the regexp_data structure.

       This interface has been deprecated in favor of the regcomp() interface
       specified by the POSIX and X/Open standards and may be retired. If
       possible, you should migrate regexp(3) regular expression routines to
       the routines offered under the regcomp() and regexec() interfaces (see
       regcomp(3)).
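       As a starting point for migration, the following sketch shows a
       compile-and-search sequence with the POSIX interface. The helper name
       find_match() and its return convention are assumptions made for this
       example. The start and end offsets it reports play the role that loc1
       and loc2 play after a successful step() call.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: search line for the first match of a basic regular
 * expression. Returns 1 and stores the match offsets on success, 0 when
 * regexec() reports no match (or fails), and -1 when the pattern does not
 * compile. */
int find_match(const char *pattern, const char *line,
               size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, 0) != 0)  /* cflags 0: basic regular expressions */
        return -1;
    rc = regexec(&re, line, 1, m, 0);
    regfree(&re);                       /* release the compiled pattern */
    if (rc != 0)
        return 0;
    *start = (size_t)m[0].rm_so;        /* offset of first matched byte */
    *end = (size_t)m[0].rm_eo;          /* offset just past the match   */
    return 1;
}
```

       Unlike compile(), regcomp() needs no user-supplied macros and manages
       its own storage, so no expbuf or endbuf arguments are required.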

       The regexp interface is provided to support System V applications. Tra‐
       ditional	 BSD  applications use different functions for regular expres‐
       sion handling. See the re_comp(3) and re_exec(3) reference pages.

       The advance(), compile(), and step()  functions	are  scheduled	to  be
       withdrawn from a future version of the X/Open CAE Specification.

       Upon  successful	 completion, the compile() function calls the RETURN()
       macro. Upon failure, this function calls the ERROR() macro.

       Whenever a successful match occurs, the step() and advance()  functions
       return a nonzero value. Upon failure, these functions return a value of
       0 (zero).

       [Tru64 UNIX]  The  compile_r(),	step_r(),  and	advance_r()  functions
       return the same values as their non-reentrant counterparts.

       If any of the following conditions occurs, the compile() or compile_r()
       functions call the ERROR() macro with an error value as its argument:

              The range endpoint is too large.
              A bad number was received.
              The number in \digit is out of range.
              There is an illegal or missing delimiter.
              There is no remembered search string.
              The use of a pair of \( and \) is unbalanced.
              There are too many \( and \) pairs (exceeds the maximum value
              set for _NBRA in regexp.h, usually 9).
              More than two numbers are given in the \{ and \} pair.
              A } character was expected after a \.
              The first number exceeds the second in the \{ and \} pair.
              There is a [ ] pair imbalance.
              There is a regular expression overflow.
              [Tru64 UNIX]  There was an unknown error.

       The following is an example of the regular expression macros and calls
       from the grep command:

       #define INIT        register char *sp=instring;
       #define GETC        (*sp++)
       #define PEEKC       (*sp)
       #define UNGETC(c)   (--sp)
       #define RETURN(c)   return;
       #define ERROR(c)    regerr

       #include <regexp.h>
               . . .

       compile (patstr, expbuf, &expbuf[ESIZE], '\0');
               . . .

       if (step (linebuf, expbuf))
               succeed( );
               . . .

       Functions: ctype(3), fnmatch(3), glob(3), regcomp(3), re_comp(3)

       Commands: ed(1), sed(1), grep(1)

       Standards: standards(5)

