pcregrep man page on OpenBSD

PCREGREP(1)					      PCREGREP(1)

NAME
       pcregrep - a grep with Perl-compatible regular expressions.

SYNOPSIS
       pcregrep [options] [long options] [pattern] [path1 path2 ...]

DESCRIPTION

       pcregrep	 searches  files  for  character  patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcrepattern(3) for a full description of syntax and	seman-
       tics of the regular expressions that PCRE supports.

       Patterns,  whether  supplied on the command line or in a separate file,
       are given without delimiters. For example:

	 pcregrep Thursday /etc/motd

       If you attempt to use delimiters (for example, by surrounding a pattern
       with  slashes,  as  is common in Perl scripts), they are interpreted as
       part of the pattern. Quotes can of course be used to  delimit  patterns
       on  the	command	 line  because	they are interpreted by the shell, and
       indeed they are required if a pattern contains  white  space  or	 shell
       metacharacters.
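
       As a minimal sketch of the quoting rules above, assuming pcregrep is on
       the PATH (the scratch directory and the file name motd.txt are purely
       illustrative):

```shell
# Work in a scratch directory with a small sample file.
cd "$(mktemp -d)" || exit 1
printf 'see you Thursday\nurl: /Thursday/ page\n' > motd.txt

pcregrep Thursday motd.txt         # prints both lines
pcregrep /Thursday/ motd.txt       # slashes are literal: prints only the second line
pcregrep 'you Thursday' motd.txt   # quotes keep the space inside one argument
```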

       The  first  argument that follows any option settings is treated as the
       single pattern to be matched when neither -e nor -f is  present.	  Con-
       versely,	 when  one  or	both of these options are used to specify pat-
       terns, all arguments are treated as path names. At least one of -e, -f,
       or an argument pattern must be provided.

       If no files are specified, pcregrep reads the standard input. The stan-
       dard input can also be referenced by a  name  consisting	 of  a	single
       hyphen.	For example:

	 pcregrep some-pattern /file1 - /file3

       By  default, each line that matches a pattern is copied to the standard
       output, and if there is more than one file, the file name is output  at
       the start of each line, followed by a colon. However, there are options
       that can change how pcregrep behaves.  In  particular,  the  -M	option
       makes  it  possible  to	search for patterns that span line boundaries.
       What defines a line  boundary  is  controlled  by  the  -N  (--newline)
       option.
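
       The default output rules can be tried with a short sketch like the one
       below. It assumes pcregrep is on the PATH; the file names a.txt and
       b.txt are made up for the illustration.

```shell
cd "$(mktemp -d)" || exit 1
printf 'alpha\nbeta\n' > a.txt
printf 'beta\ngamma\n' > b.txt

pcregrep beta a.txt          # one file: prints "beta" with no file name
pcregrep beta a.txt b.txt    # two files: prints "a.txt:beta" and "b.txt:beta"

# A single hyphen names the standard input alongside ordinary files.
printf 'beta again\n' | pcregrep beta a.txt -
# prints "a.txt:beta" and "(standard input):beta again"
```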

       Patterns	 are  limited  to  8K  or  BUFSIZ characters, whichever is the
       greater.	 BUFSIZ is defined in <stdio.h>. When there is more  than  one
       pattern (specified by the use of -e and/or -f), each pattern is applied
       to each line in the order in which they are defined,  except  that  all
       the -e patterns are tried before the -f patterns.

       By  default,  as soon as one pattern matches (or fails to match when -v
       is used), no further patterns are considered. However, if --colour  (or
       --color) is used to colour the matching substrings, or if --only-match-
       ing, --file-offsets, or --line-offsets is used to output only the  part
       of  the	line  that  matched (either shown literally, or as an offset),
       scanning resumes immediately  following	the  match,  so	 that  further
       matches	on the same line can be found. If there are multiple patterns,
       they are all tried on the remainder of the line, but patterns that fol-
       low the one that matched are not tried on the earlier part of the line.

       This is the same behaviour as GNU grep, but it does mean that the order
       in which multiple patterns are specified can affect the output when one
       of the above options is used.

       Patterns that can match an empty string are accepted, but empty	string
       matches	 are   never   recognized.   An	  example   is	 the   pattern
       "(super)?(man)?", in which all components are  optional.	 This  pattern
       finds  all  occurrences	of  both "super" and "man"; the output differs
       from matching with "super|man" when only the  matching  substrings  are
       being shown.

       If  the	LC_ALL	or LC_CTYPE environment variable is set, pcregrep uses
       the value to set a locale when calling the PCRE library.	 The  --locale
       option can be used to override this.

SUPPORT FOR COMPRESSED FILES

       It  is  possible	 to compile pcregrep so that it uses libz or libbz2 to
       read files whose names end in .gz or .bz2, respectively. You  can  find
       out whether your binary has support for one or both of these file types
       by running it with the --help option. If the appropriate support is not
       present,	 files are treated as plain text. The standard input is always
       so treated.

OPTIONS

       The order in which some of the options appear can  affect  the  output.
       For  example,  both  the	 -h and -l options affect the printing of file
       names. Whichever comes later in the command line will be the  one  that
       takes effect.

       --        This terminates the list of options. It is useful if the next
		 item on the command line starts with a hyphen but is  not  an
		 option.  This allows for the processing of patterns and file-
		 names that start with hyphens.
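
                 A small sketch of searching for a pattern that begins with a
                 hyphen, assuming pcregrep is on the PATH (notes.txt is an
                 illustrative file name):

```shell
cd "$(mktemp -d)" || exit 1
printf 'pass -x to anchor\nplain line\n' > notes.txt

pcregrep -- -x notes.txt    # "-x" is taken as the pattern; prints "pass -x to anchor"
pcregrep -e -x notes.txt    # same result, using -e to introduce the pattern
```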

       -A number, --after-context=number
		 Output number lines of context after each matching  line.  If
		 filenames and/or line numbers are being output, a hyphen sep-
		 arator is used instead of a colon for the  context  lines.  A
		 line  containing  "--" is output between each group of lines,
		 unless they are in fact contiguous in	the  input  file.  The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees that up to 8K of following text is
                 available for context output.

       -B number, --before-context=number
		 Output	 number lines of context before each matching line. If
		 filenames and/or line numbers are being output, a hyphen sep-
		 arator	 is  used  instead of a colon for the context lines. A
		 line containing "--" is output between each group  of	lines,
		 unless	 they  are  in	fact contiguous in the input file. The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees that up to 8K of preceding text is
                 available for context output.

       -C number, --context=number
		 Output number lines of context both  before  and  after  each
		 matching  line.  This is equivalent to setting both -A and -B
		 to the same value.
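
                 The separators described for the context options can be seen
                 in a sketch such as the following, which assumes pcregrep is
                 on the PATH and uses an invented file log.txt:

```shell
cd "$(mktemp -d)" || exit 1
printf 'one\nhit two\nthree\nfour\nfive\nhit six\nseven\n' > log.txt

# -n adds line numbers so the colon and hyphen separators are visible.
pcregrep -n -A 1 hit log.txt
# 2:hit two
# 3-three
# --
# 6:hit six
# 7-seven
```

                 Matching lines carry a colon after the line number, context
                 lines carry a hyphen, and "--" separates the two groups
                 because they are not contiguous in the file.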

       -c, --count
		 Do not output individual lines from the files that are	 being
		 scanned; instead output the number of lines that would other-
		 wise have been shown. If no lines are	selected,  the	number
                 zero  is  output.  If  several  files  are  being scanned,  a
		 count is output for each of them. However,  if	 the  --files-
		 with-matches  option  is  also	 used,	only those files whose
		 counts are greater than zero are listed. When -c is used, the
		 -A, -B, and -C options are ignored.
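
                 A brief sketch of counting, with and without -l, assuming
                 pcregrep is on the PATH (a.txt and b.txt are illustrative):

```shell
cd "$(mktemp -d)" || exit 1
printf 'beta\nbeta two\n' > a.txt
printf 'gamma\n' > b.txt

pcregrep -c beta a.txt b.txt      # prints "a.txt:2" and "b.txt:0"
pcregrep -c -l beta a.txt b.txt   # prints only "a.txt:2"; the zero count is dropped
```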

       --colour, --color
		 If this option is given without any data, it is equivalent to
		 "--colour=auto".  If data is required, it must	 be  given  in
		 the same shell item, separated by an equals sign.

       --colour=value, --color=value
		 This option specifies under what circumstances the parts of a
		 line that matched a pattern should be coloured in the output.
		 By  default,  the output is not coloured. The value (which is
		 optional, see above) may be "never", "always", or "auto".  In
                 the last case, colouring happens only if the standard  output
                 is  connected  to  a  terminal.  More resources are used when
		 colouring  is enabled, because pcregrep has to search for all
		 possible matches in a line, not just one, in order to	colour
		 them all.

		 The colour that is used can be specified by setting the envi-
		 ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
		 of this variable should be a string of two numbers, separated
		 by a semicolon. They are copied  directly  into  the  control
		 string	 for  setting  colour  on  a  terminal,	 so it is your
		 responsibility to ensure that they make sense. If neither  of
		 the  environment  variables  is  set,	the default is "1;31",
		 which gives red.
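
                 As a sketch of overriding the colour, assuming pcregrep is on
                 the PATH and a terminal that understands the usual SGR colour
                 codes (a.txt is an illustrative file name):

```shell
cd "$(mktemp -d)" || exit 1
printf 'alpha beta gamma\n' > a.txt

# "1;32" is copied into the terminal control string (bold green on most terminals).
# --colour=always forces colouring even when the output is a pipe or a file.
PCREGREP_COLOUR='1;32' pcregrep --colour=always beta a.txt

# Pipe through od to inspect the escape sequences that surround the match.
PCREGREP_COLOUR='1;32' pcregrep --colour=always beta a.txt | od -c
```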

       -D action, --devices=action
		 If an input path is  not  a  regular  file  or	 a  directory,
		 "action"  specifies  how  it is to be processed. Valid values
		 are "read" (the default) or "skip" (silently skip the	path).

       -d action, --directories=action
		 If an input path is a directory, "action" specifies how it is
		 to be processed.  Valid  values  are  "read"  (the  default),
		 "recurse"  (equivalent to the -r option), or "skip" (silently
		 skip the path). In the default case, directories are read  as
		 if  they  were	 ordinary files. In some operating systems the
		 effect of reading a directory like this is an immediate  end-
		 of-file.

       -e pattern, --regex=pattern, --regexp=pattern
		 Specify a pattern to be matched. This option can be used mul-
		 tiple times in order to specify several patterns. It can also
		 be  used  as a way of specifying a single pattern that starts
		 with a hyphen. When -e is used, no argument pattern is	 taken
		 from  the  command  line;  all	 arguments are treated as file
		 names. There is an overall maximum of 100 patterns. They  are
		 applied  to  each line in the order in which they are defined
		 until one matches (or fails to match if -v is used). If -f is
		 used  with  -e,  the command line patterns are matched first,
		 followed by the patterns from the file,  independent  of  the
		 order	in which these options are specified. Note that multi-
		 ple use of -e is not the same as a single pattern with alter-
		 natives. For example, X|Y finds the first character in a line
		 that is X or Y, whereas if the two patterns are  given	 sepa-
		 rately, pcregrep finds X if it is present, even if it follows
		 Y in the line. It finds Y only if there is no X in the	 line.
		 This  really  matters	only  if  you are using -o to show the
		 part(s) of the line that matched.

       --exclude=pattern
		 When pcregrep is searching the files in a directory as a con-
		 sequence  of  the  -r	(recursive search) option, any regular
		 files whose names match the pattern are excluded. Subdirecto-
		 ries  are  not	 excluded  by  this  option; they are searched
		 recursively, subject to the --exclude_dir  and	 --include_dir
		 options.  The	pattern	 is  a PCRE regular expression, and is
		 matched against the final component of the file name (not the
		 entire	 path).	 If  a	file  name  matches both --include and
		 --exclude, it is excluded.  There is no short form  for  this
		 option.

       --exclude_dir=pattern
		 When  pcregrep	 is searching the contents of a directory as a
		 consequence of the -r (recursive search) option,  any	subdi-
		 rectories  whose  names match the pattern are excluded. (Note
		 that the --exclude option does	 not  affect  subdirectories.)
		 The  pattern  is  a  PCRE  regular expression, and is matched
		 against the final component  of  the  name  (not  the	entire
		 path).	 If a subdirectory name matches both --include_dir and
		 --exclude_dir, it is excluded. There is  no  short  form  for
		 this option.

       -F, --fixed-strings
                 Interpret  each pattern as a list of fixed strings, separated
		 by newlines, instead of  as  a	 regular  expression.  The  -w
		 (match	 as  a	word) and -x (match whole line) options can be
		 used with -F. They apply to each of the fixed strings. A line
		 is selected if any of the fixed strings are found in it (sub-
		 ject to -w or -x, if present).

       -f filename, --file=filename
		 Read a number of patterns from the file, one  per  line,  and
		 match	them against each line of input. A data line is output
		 if any of the patterns match it. The filename can be given as
		 "-" to refer to the standard input. When -f is used, patterns
		 specified on the command line using -e may also  be  present;
		 they are tested before the file's patterns. However, no other
		 pattern is taken from the command  line;  all	arguments  are
		 treated  as  file  names.  There is an overall maximum of 100
		 patterns. Trailing white space is removed from each line, and
		 blank	lines  are ignored. An empty file contains no patterns
		 and therefore matches nothing. See also  the  comments	 about
		 multiple  patterns  versus a single pattern with alternatives
		 in the description of -e above.

       --file-offsets
		 Instead of showing lines or parts of lines that  match,  show
		 each  match  as  an  offset  from the start of the file and a
		 length, separated by a comma. In this	mode,  no  context  is
		 shown.	 That  is,  the -A, -B, and -C options are ignored. If
		 there is more than one match in a line, each of them is shown
		 separately.  This  option  is mutually exclusive with --line-
		 offsets and --only-matching.

       -H, --with-filename
		 Force the inclusion of the filename at the  start  of	output
		 lines	when searching a single file. By default, the filename
		 is not shown in this case. For matching lines,	 the  filename
		 is followed by a colon; for context lines, a hyphen separator
		 is used. If a line number is also being  output,  it  follows
		 the file name.

       -h, --no-filename
		 Suppress  the output filenames when searching multiple files.
		 By default, filenames	are  shown  when  multiple  files  are
		 searched.  For	 matching lines, the filename is followed by a
		 colon; for context lines, a hyphen separator is used.	 If  a
		 line number is also being output, it follows the file name.

       --help	 Output	 a  help  message, giving brief details of the command
		 options and file type support, and then exit.

       -i, --ignore-case
		 Ignore upper/lower case distinctions during comparisons.

       --include=pattern
		 When pcregrep is searching the files in a directory as a con-
                 sequence  of  the  -r  (recursive  search) option, only those
		 regular files whose names match  the  pattern	are  included.
		 Subdirectories	 are always included and searched recursively,
		 subject to the --include_dir and --exclude_dir	 options.  The
		 pattern  is a PCRE regular expression, and is matched against
		 the final component of the file name (not the	entire	path).
		 If  a	file  name matches both --include and --exclude, it is
		 excluded. There is no short form for this option.
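
                 Because the --include value is a regular expression rather
                 than a shell glob, a file-suffix filter looks like the sketch
                 below. It assumes pcregrep is on the PATH; the directory
                 layout under tree/ is invented for the illustration.

```shell
cd "$(mktemp -d)" || exit 1
mkdir -p tree/sub
printf 'beta\n' > tree/sub/notes.txt
printf 'beta\n' > tree/sub/notes.log

# Only regular files whose final path component matches \.txt$ are searched;
# the subdirectory "sub" is still descended into.
pcregrep -r --include='\.txt$' beta tree    # prints "tree/sub/notes.txt:beta"
```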

       --include_dir=pattern
		 When pcregrep is searching the contents of a directory	 as  a
		 consequence  of  the -r (recursive search) option, only those
		 subdirectories whose names match the  pattern	are  included.
		 (Note	that  the --include option does not affect subdirecto-
		 ries.) The pattern is	a  PCRE	 regular  expression,  and  is
		 matched  against  the	final  component  of the name (not the
		 entire	 path).	 If   a	  subdirectory	 name	matches	  both
		 --include_dir	and --exclude_dir, it is excluded. There is no
		 short form for this option.

       -L, --files-without-match
		 Instead of outputting lines from the files, just  output  the
		 names	of  the files that do not contain any lines that would
		 have been output. Each file name is output once, on  a	 sepa-
		 rate line.

       -l, --files-with-matches
		 Instead  of  outputting lines from the files, just output the
		 names of the files containing lines that would have been out-
		 put.  Each  file  name	 is  output  once, on a separate line.
		 Searching normally stops as soon as a matching line is	 found
		 in  a	file.  However, if the -c (count) option is also used,
		 matching continues in order to obtain the correct count,  and
		 those	files  that  have  at least one match are listed along
		 with their counts. Using this option with -c is a way of sup-
		 pressing the listing of files with no matches.

       --label=name
		 This option supplies a name to be used for the standard input
		 when file names are being output. If not supplied, "(standard
		 input)" is used. There is no short form for this option.

       --line-offsets
		 Instead  of  showing lines or parts of lines that match, show
		 each match as a line number, the offset from the start of the
		 line,	and a length. The line number is terminated by a colon
		 (as usual; see the -n option), and the offset and length  are
		 separated  by	a  comma.  In  this mode, no context is shown.
		 That is, the -A, -B, and -C options are ignored. If there  is
		 more  than  one  match in a line, each of them is shown sepa-
		 rately. This option is mutually exclusive with --file-offsets
		 and --only-matching.
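
                 The two offset formats can be compared with a sketch like
                 this one, assuming pcregrep is on the PATH (a.txt is an
                 illustrative file name):

```shell
cd "$(mktemp -d)" || exit 1
printf 'abc beta\nbeta x beta\n' > a.txt

# Offsets from the start of the file, then the match length.
pcregrep --file-offsets beta a.txt
# 4,4
# 9,4
# 16,4

# Line number, then the offset within that line, then the match length.
pcregrep --line-offsets beta a.txt
# 1:4,4
# 2:0,4
# 2:7,4
```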

       --locale=locale-name
                 This  option  specifies  a  locale  to  be  used  for pattern
		 matching. It overrides the value in the  LC_ALL  or  LC_CTYPE
		 environment  variables.  If  no locale is specified, the PCRE
		 library's default (usually the "C" locale) is used. There  is
		 no short form for this option.

       -M, --multiline
		 Allow	patterns to match more than one line. When this option
		 is given, patterns may usefully contain literal newline char-
		 acters	 and  internal	occurrences of ^ and $ characters. The
		 output for any one match may consist of more than  one	 line.
		 When  this option is set, the PCRE library is called in "mul-
		 tiline" mode.	There is a limit to the number of  lines  that
		 can  be matched, imposed by the way that pcregrep buffers the
		 input file as it scans it. However, pcregrep ensures that  at
		 least 8K characters or the rest of the document (whichever is
		 the shorter) are available for forward	 matching,  and	 simi-
		 larly the previous 8K characters (or all the previous charac-
		 ters, if fewer than 8K) are guaranteed to  be	available  for
		 lookbehind assertions.
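
                 A minimal sketch of a pattern that spans a line boundary,
                 assuming pcregrep is on the PATH (a.txt is illustrative):

```shell
cd "$(mktemp -d)" || exit 1
printf 'begin\nend\nother\n' > a.txt

# With -M the \n in the pattern can match the newline between two lines.
pcregrep -M 'begin\nend' a.txt      # prints the two lines "begin" and "end"

# Without -M each line is matched on its own, so nothing is found.
pcregrep 'begin\nend' a.txt || echo 'no match without -M'
```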

       -N newline-type, --newline=newline-type
		 The  PCRE  library  supports  five  different conventions for
		 indicating the ends of lines. They are	 the  single-character
		 sequences  CR	(carriage  return) and LF (linefeed), the two-
		 character sequence CRLF, an "anycrlf" convention, which  rec-
		 ognizes  any  of the preceding three types, and an "any" con-
		 vention, in which any Unicode line ending sequence is assumed
		 to  end a line. The Unicode sequences are the three just men-
		 tioned,  plus	VT  (vertical  tab,  U+000B),  FF   (formfeed,
		 U+000C),   NEL	 (next	line,  U+0085),	 LS  (line  separator,
		 U+2028), and PS (paragraph separator, U+2029).

		 When  the  PCRE  library  is  built,  a  default  line-ending
		 sequence   is	specified.   This  is  normally	 the  standard
		 sequence for the operating system. Unless otherwise specified
		 by  this  option,  pcregrep  uses the library's default.  The
		 possible values for this option are CR, LF, CRLF, ANYCRLF, or
		 ANY.  This  makes  it	possible to use pcregrep on files that
		 have come from other environments without  having  to	modify
		 their	line  endings.	If the data that is being scanned does
		 not agree with the convention set by  this  option,  pcregrep
		 may behave in strange ways.
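
                 The effect of the newline setting on a file with CRLF line
                 endings can be sketched as follows, assuming pcregrep is on
                 the PATH (dos.txt is an illustrative file name):

```shell
cd "$(mktemp -d)" || exit 1
printf 'one\r\ntwo beta\r\n' > dos.txt

# With CRLF as the line ending, $ matches just before the CR LF pair.
pcregrep -N CRLF -c 'beta$' dos.txt    # prints 1

# With LF as the line ending, the CR is part of the line, so beta$ fails.
pcregrep -N LF -c 'beta$' dos.txt || echo 'no match: the CR is part of the line'
```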

       -n, --line-number
		 Precede each output line by its line number in the file, fol-
		 lowed by a colon for matching lines or a hyphen  for  context
		 lines.	 If the filename is also being output, it precedes the
		 line number. This option is forced if --line-offsets is used.

       -o, --only-matching
		 Show  only  the  part	of the line that matched a pattern. In
		 this mode, no context is shown. That is, the -A, -B,  and  -C
		 options  are  ignored.	 If  there is more than one match in a
                 line, each of them is shown separately.  If  -o  is  combined
		 with  -v  (invert the sense of the match to find non-matching
		 lines), no output is generated, but the return	 code  is  set
		 appropriately. This option is mutually exclusive with --file-
		 offsets and --line-offsets.

       -q, --quiet
		 Work quietly, that is, display nothing except error messages.
		 The  exit  status  indicates  whether or not any matches were
		 found.

       -r, --recursive
		 If any given path is a directory, recursively scan the	 files
		 it  contains, taking note of any --include and --exclude set-
		 tings. By default, a directory is read as a normal  file;  in
		 some  operating  systems this gives an immediate end-of-file.
		 This option is a shorthand  for  setting  the	-d  option  to
		 "recurse".

       -s, --no-messages
		 Suppress  error  messages  about  non-existent	 or unreadable
		 files. Such files are quietly skipped.	 However,  the	return
		 code is still 2, even if matches were found in other files.

       -u, --utf-8
		 Operate  in UTF-8 mode. This option is available only if PCRE
		 has been compiled with UTF-8 support. Both patterns and  sub-
		 ject lines must be valid strings of UTF-8 characters.

       -V, --version
		 Write	the  version  numbers of pcregrep and the PCRE library
		 that is being used to the standard error stream.

       -v, --invert-match
		 Invert the sense of the match, so that	 lines	which  do  not
		 match any of the patterns are the ones that are found.

       -w, --word-regex, --word-regexp
		 Force the patterns to match only whole words. This is equiva-
		 lent to having \b at the start and end of the pattern.

       -x, --line-regex, --line-regexp
		 Force the patterns to be anchored (each must  start  matching
		 at  the beginning of a line) and in addition, require them to
		 match entire lines. This is equivalent	 to  having  ^	and  $
		 characters at the start and end of each alternative branch in
		 every pattern.
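
       The difference between -w and -x can be seen in a short sketch,
       assuming pcregrep is on the PATH (words.txt is an illustrative file
       name):

```shell
cd "$(mktemp -d)" || exit 1
printf 'cat\nconcatenate\nthe cat sat\n' > words.txt

pcregrep cat words.txt       # prints all three lines
pcregrep -w cat words.txt    # whole words only: "cat" and "the cat sat"
pcregrep -x cat words.txt    # whole lines only: "cat"
```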

ENVIRONMENT VARIABLES

       The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
       order,  for  a  locale.	The first one that is set is used. This can be
       overridden by the --locale option.  If  no  locale  is  set,  the  PCRE
       library's default (usually the "C" locale) is used.

NEWLINES

       The  -N (--newline) option allows pcregrep to scan files with different
       newline conventions from the default.  However,	the  setting  of  this
       option  does not affect the way in which pcregrep writes information to
       the standard error and output streams. It uses the  string  "\n"	 in  C
       printf()	 calls	to  indicate newlines, relying on the C I/O library to
       convert this to an appropriate sequence if the  output  is  sent	 to  a
       file.

OPTIONS COMPATIBILITY

       The majority of short and long forms of pcregrep's options are the same
       as in the GNU grep program. Any long option of  the  form  --xxx-regexp
       (GNU  terminology) is also available as --xxx-regex (PCRE terminology).
       However, the --locale, -M, --multiline, -u,  and	 --utf-8  options  are
       specific to pcregrep. If both the -c and -l options are given, GNU grep
       lists only file names, without counts, but pcregrep gives the counts.

OPTIONS WITH DATA

       There are four different ways in which an option with data can be spec-
       ified.	If  a  short  form option is used, the data may follow immedi-
       ately, or in the next command line item. For example:

	 -f/some/file
	 -f /some/file

       If a long form option is used, the data may appear in the same  command
       line item, separated by an equals character, or (with one exception) it
       may appear in the next command line item. For example:

	 --file=/some/file
	 --file /some/file

       Note, however, that if you want to supply a file name beginning with  ~
       as  data	 in  a	shell  command,	 and have the shell expand ~ to a home
       directory, you must separate the file name from the option, because the
       shell  does not treat ~ specially unless it is at the start of an item.

       The exception to the above is the --colour  (or	--color)  option,  for
       which  the  data is optional. If this option does have data, it must be
       given in the first form, using an equals character. Otherwise  it  will
       be assumed that it has no data.
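
       The four equivalent forms can be checked side by side with a sketch
       like the following, assuming pcregrep is on the PATH (patterns.txt and
       a.txt are illustrative file names):

```shell
cd "$(mktemp -d)" || exit 1
printf 'beta\n' > patterns.txt
printf 'alpha\nbeta\n' > a.txt

# Each of these reads its patterns from patterns.txt and prints "beta".
pcregrep -fpatterns.txt a.txt
pcregrep -f patterns.txt a.txt
pcregrep --file=patterns.txt a.txt
pcregrep --file patterns.txt a.txt
```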

MATCHING ERRORS

       It  is  possible	 to supply a regular expression that takes a very long
       time to fail to match certain lines.  Such  patterns  normally  involve
       nested  indefinite repeats, for example: (a+)*\d when matched against a
       line of a's with no final digit.	 The  PCRE  matching  function	has  a
       resource	 limit that causes it to abort in these circumstances. If this
       happens, pcregrep outputs an error message and the line that caused the
       problem  to  the  standard error stream. If there are more than 20 such
       errors, pcregrep gives up.

DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and 2 for syntax errors and non-existent or inaccessible files (even if
       matches were found in other files) or too many matching errors.   Using
       the -s option to suppress error messages about inaccessible files  does
       not affect the return code.
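
       A sketch of how a script might observe the three exit codes, assuming
       pcregrep is on the PATH (a.txt and the deliberately absent missing.txt
       are illustrative names):

```shell
cd "$(mktemp -d)" || exit 1
printf 'alpha\nbeta\n' > a.txt

pcregrep -q beta a.txt && echo "status $?"    # match found: prints "status 0"
pcregrep -q zzz a.txt  || echo "status $?"    # no match: prints "status 1"

# -s hides the error message for the missing file, but the status is still 2.
pcregrep -s beta a.txt missing.txt || echo "status $?"
# prints "a.txt:beta" and then "status 2"
```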

SEE ALSO

       pcrepattern(3), pcretest(1).

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

REVISION

       Last updated: 13 September 2009
       Copyright (c) 1997-2009 University of Cambridge.
