Library /sys$common/syshlp/lse$menu.hlb
Patterns
The three pattern styles VMS, ULTRIX and TPU can be used in search and
substitute commands. The pattern style is set by the SET SEARCH PATTERN
command.

1. VMS style patterns

The VMS pattern style enables the special interpretation of wildcard
characters and a quote character in the search-string parameter as shown
below:

   VMS-Style Wildcards

   Wildcard                 Matches

   *                        One or more characters of any kind on a line.
   **                       One or more characters of any kind crossing
                            lines.
   %                        A single character.
   \<                       Beginning of a line.
   \>                       End of a line.
   \[set-of-characters]     Any character in the specified set. For
                            example, \[abc] matches any letter in the set
                            "abc" and \[c-t] matches any letter in the
                            set "c" through "t."
   \[~set-of-characters]    Anything not in the specified set of
                            characters.
   \                        Lets you specify the characters \,*,% or ]
                            within wildcard expressions. For example, \\
                            matches the backslash character (\).
   \.                       Repeats the previous pattern zero or more
                            times, including the original.
   \:                       Repeats the previous pattern at least once,
                            including the original; that is, a null
                            occurrence does not match.
   \w                       Any empty space created by the space bar or
                            tab stops, including no more than one line
                            break.
   \d                       Any decimal digit.
   \o                       Any octal digit.
   \x                       Any hexadecimal digit.
   \a                       Any alphabetic character, including accented
                            letters, other marked letters, and
                            non-English letters.
   \n                       Any alphanumeric character.
   \s                       Any character that can be used in a symbol:
                            alphanumeric, dollar sign, and underscore.
   \l                       Any lowercase letter.
   \u                       Any uppercase letter.
   \p                       Any punctuation character.
   \f                       Any formatting characters: backspace, tab,
                            line feed, vertical tab, form feed, and
                            carriage return.
   \^                       Any control character.
   \+                       Any character with bit 7 set; that is, ASCII
                            decimal values from 128 through 255.

For example the following command will find a line starting with an
uppercase letter:

   PATTERN SEARCH "\<\u"

2. ULTRIX style patterns

The ULTRIX pattern style enables the special interpretation of wildcard
characters and a quote character in the search-string parameter as shown
below:

   ULTRIX-Style Wildcards

   Wildcard                 Matches

   .                        A single character.
   ^                        Beginning of a line.
   $                        End of a line.
   [set-of-characters]      Any character in the specified set. For
                            example, [abc] matches any letter in the set
                            "abc" and [c-t] matches any letter in the set
                            "c" through "t."
   [^set-of-characters]     Anything not in the specified set of
                            characters.
   \                        Lets you specify the characters \,.,^,$,[,],or
                            * in wildcard expressions. For example, \\
                            matches the backslash character (\).
   *                        Repeats the previous pattern zero or more
                            times, including the original.
   +                        Repeats the previous pattern at least once,
                            including the original; that is, a null
                            occurrence does not match.

For example the following command will find a line starting with a, b or
c:

   PATTERN SEARCH "^[abc]"

3. TPU style patterns

The TPU pattern style enables the use of TPU patterns. For full details
of TPU patterns see the DEC Text Processing Utility Manual.

3.1 Simple examples

The first example searches for abc or def and the second example
substitutes all occurrences of abc or def by ghi:

   PATTERN SEARCH "'abc' | 'def'"
   PATTERN SUBSTITUTE "'abc' | 'def'" "'ghi'" ALL

In the examples 'abc', 'def' and 'ghi' are TPU strings and | is the TPU
pattern alternation operator. The outermost quotes in the examples must
be omitted if the parameters are prompted for or if a dialog box is used.

3.2 Search string

The search string is a TPU expression that must evaluate to a TPU
pattern.

3.3 Replace string

The replace string is a TPU expression that must evaluate to a TPU
string.

3.4 Partial pattern assignment variables

Partial pattern assignment variables allow a substitution to be a
function of the found pattern.
For example, the following command replaces a date of the form yyyy/mm/dd
with one of the form dd/mm/yyyy:

   PATTERN SUBSTITUTE -
      "(_year@_v1)+'/'+(_month@_v2)+'/'+(_day@_v3)" -
      "str(_v3)+'/'+str(_v2)+'/'+ str(_v1)"

when applied to:

   1998/04/21

generates:

   21/04/1998

In the above example _year, _month and _day are TPU variables holding
patterns that match the year, month and day parts of a date; for details
of how to set up these variables see Section 3.8. @ is the TPU partial
pattern assignment operator and _v1, _v2 and _v3 are partial pattern
assignment variables that are set to the found year, month and day. A
partial pattern assignment variable holds a TPU range and, when used in
the replacement string, must be converted to a string using the TPU
procedure STR.

For example, the following command will prefix any lines that start with
any three characters from ABCDEFGHI with XYZ_ :

   PATTERN SUBSTITUTE -
      "LINE_BEGIN + (ANY('ABCDEFGHI',3)@_v1)" -
      "'XYZ_'+ str(_v1)" -
      ALL

before substitution

   abc
   012
   defghi

after substitution

   XYZ_abc
   012
   XYZ_defghi

In the above example LINE_BEGIN is a TPU keyword that matches the
beginning of a line and ANY is a TPU pattern procedure that matches a
specified number of characters from a specified set of characters.

3.5 New line

A new line will be generated for each line feed character in the
replacement string. A line feed character can be introduced by means of
the TPU procedure ASCII with the value 10 as a parameter.

For example, to replace any numbers at the end of lines with the string
'xxx' (a line feed is necessary because the search pattern includes the
end of the line):

   PATTERN SUBSTITUTE -
      "_n + LINE_END" -
      "'xxx' + ASCII(10)" -
      ALL

before substitution

   123 456
   789

after substitution

   123 xxx
   xxx

In the above example LINE_END is a TPU keyword that matches the end of a
line and _n is a TPU variable holding a pattern that matches a number.
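For readers more familiar with regular expressions, the end-of-line
substitution above can be mirrored in Python. This is only an analogy
using Python's re module, not LSE or TPU code: the function name
mask_trailing_numbers is hypothetical, and [0-9]+ stands in for the _n
pattern. Because the regular expression consumes the newline, much as
"_n + LINE_END" includes the end of the line, the replacement has to
supply a newline again.

```python
import re


def mask_trailing_numbers(text: str) -> str:
    """Replace a number that ends a line with 'xxx' (hypothetical helper)."""
    # The pattern matches the digits AND the newline that follows them,
    # so the replacement must put a newline back to keep the lines apart.
    return re.sub(r"[0-9]+\n", "xxx\n", text)


before = "123 456\n789\n"
print(mask_trailing_numbers(before), end="")
```

In this Python sketch, leaving the newline out of the replacement would
join the affected lines together, which is why the replacement carries it
explicitly.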
When a partial pattern assignment variable is converted to a string by
the TPU procedure STR, an optional second parameter can be set to
ASCII(10) to cause any ends of lines in the range described by the
variable to be converted to line feed characters (without the parameter
they are represented by the null string). For example:

   PATTERN SUBSTITUTE -
      "(LINE_BEGIN + _n + LINE_END + _n + LINE_END)@_v1" -
      "STR(_v1, ASCII(10)) + STR(_v1, ASCII(10))" -
      ALL

before substitution

   123
   456

after substitution

   123
   456
   123
   456

Carriage return characters adjacent to line feed characters in the
replacement string are ignored.

3.6 Errors

The search and replace strings are TPU expressions which have to be
evaluated and may generate various TPU compilation / evaluation error
messages. The following error messages are generated for invalid search
or replace strings:

   Error in search pattern
   Error in replacement string

These messages will normally be preceded by various TPU error messages.
For example, the search string "'aaa' + bbb" would result in the
following error messages:

   Undefined procedure call BBB
   Operand combination STRING + INTEGER unsupported
   Error in search pattern

3.7 Global variables

Partial pattern assignment variables and pattern variables (such as _year
in an earlier example) need to be global and must not clash with any TPU
global variables used by LSE. This can be achieved by starting any such
variable names with an underscore character.

3.8 Pattern variables

Any complicated search or substitution is likely to need various pattern
variables to have already been set up. This can be achieved in various
ways.
The definitions can be set up by issuing TPU commands, for example:

   TPU "_digits:='0123456789'"
   TPU "_digit:=any(_digits)"
   TPU "_year:=any(_digits,4)"
   TPU "_month:=any('01',1)+_digit"
   TPU "_day:=any('0123',1)+_digit"
   TPU "_n:=span(_digits)"

The file LSE$PATTERNS.TPU in the LSE$EXAMPLE directory contains some
examples of patterns which can be added to LSE by means of the following
commands:

   OPEN FILE LSE$EXAMPLE:LSE$PATTERNS.TPU
   EXTEND *
   TPU "LSE$PATTERNS_MODULE_INIT"

3.9 Use for developing DTM user filters

The user defined filters global replace feature introduced in Compaq
Digital Test Manager 4.0 can be simulated using the PATTERN SUBSTITUTE
command. This allows DTM user defined filters to be developed
interactively using LSE.

For example, to replace any numbers at the end of lines with the string
'xxx':

   global_replace( _n + LINE_END, 'xxx' + ASCII(10), NO_EXACT, OFF, ON);

The LSE equivalent (assuming that the current search attributes are
equivalent to NO_EXACT) is:

   PATTERN SUBSTITUTE -
      "_n + LINE_END" -
      "'xxx' + ASCII(10)" -
      ALL

The LSE equivalent of the pattern to replace parameter (first parameter
of the global_replace routine) is the same except that the parameter has
to be in quotes.

The LSE equivalent of the replacement string parameter (second parameter)
is the same if the evaluate replacement parameter (fourth parameter) is
set to ON and is the same except that the parameter has to be in quotes
if the evaluate replacement parameter is set to OFF.

The LSE equivalent of the search mode parameter (third parameter) is the
setting of the search options (set by the SET SEARCH command).

LSE does not have equivalents of the evaluate replacement parameter
(fourth parameter) or the convert linefeeds parameter (fifth parameter).
It always evaluates the replacement string parameter and it always
converts linefeed characters (and ignores adjacent carriage return
characters).
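The pattern variables of Section 3.8 and the date substitution of
Section 3.4 have a close analogue in regular expressions, which may help
when reading the TPU forms. The Python sketch below is an illustration
only and is not part of LSE, TPU or DTM: the names DIGIT, YEAR, MONTH,
DAY and reorder_dates are hypothetical, and named groups stand in for the
partial pattern assignment variables _v1, _v2 and _v3.

```python
import re

# Hypothetical Python stand-ins for the TPU pattern variables of Section 3.8.
DIGIT = r"[0-9]"           # _digit := any(_digits)
YEAR = DIGIT + "{4}"       # _year  := any(_digits,4)
MONTH = r"[01]" + DIGIT    # _month := any('01',1)+_digit
DAY = r"[0-3]" + DIGIT     # _day   := any('0123',1)+_digit

# Named groups play the role of the partial pattern assignment variables.
DATE = re.compile(f"(?P<v1>{YEAR})/(?P<v2>{MONTH})/(?P<v3>{DAY})")


def reorder_dates(text: str) -> str:
    """Rewrite every yyyy/mm/dd date in text as dd/mm/yyyy."""
    return DATE.sub(r"\g<v3>/\g<v2>/\g<v1>", text)


print(reorder_dates("1998/04/21"))
```

As in the TPU version, the small patterns are built once and then
combined, so the search expression itself stays short.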