.deEX
.ift .ft5
.nf
..
.deEE
.ft1
.fi
..
.TH REGEXP9 7
.SH NAME
regexp \- Plan 9 regular expression notation
.SH DESCRIPTION
This manual page describes the regular expression
syntax used by the Plan 9 regular expression library
.IR regexp9 (3).
It is the form used by
.IR egrep (1)
before
.I egrep
got complicated.
.PP
A
.I "regular expression"
specifies
a set of strings of characters.
A member of this set of strings is said to be
.I matched
by the regular expression.  In many applications
a delimiter character, commonly
.LR / ,
bounds a regular expression.
In the following specification for regular expressions
the word `character' means any character (rune) but newline.
.PP
The syntax for a regular expression
.B e0
is
.IP
.EX
e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'

e2:  e3
  |  e2 REP

REP: '*' | '+' | '?'

e1:  e2
  |  e1 e2

e0:  e1
  |  e0 '|' e1
.EE
.PP
A
.B literal
is any non-metacharacter, or a metacharacter
(one of
.BR .*+?[]()|\e^$ ),
or the delimiter
preceded by
.LR \e .
.PP
A
.B charclass
is a nonempty string
.I s
bracketed
.BI [ \|s\| ]
(or
.BI [^ s\| ]\fR);
it matches any character in (or not in)
.IR s .
A negated character class never
matches newline.
A substring
.IB a - b\f1,
with
.I a
and
.I b
in ascending
order, stands for the inclusive
range of
characters between
.I a
and
.IR b .
In
.IR s ,
the metacharacters
.LR - ,
.LR ] ,
an initial
.LR ^ ,
and the regular expression delimiter
must be preceded by a
.LR \e ;
other metacharacters
have no special meaning and
may appear unescaped.
.PP
A
.L .
matches any character.
.PP
A
.L ^
matches the beginning of a line;
.L $
matches the end of the line.
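.PP
As a rough illustration of the character class,
dot, and line anchor rules described so far,
the following Python sketch may help.
It assumes Python's
.B re
module as a stand-in for the Plan 9 library,
which it is not;
the names
.B TEXT
and
.B first_match
are hypothetical,
and only constructs that behave alike in both notations are used.
Other cases, such as a negated class meeting a newline,
may behave differently in Python.
.EX
```python
import re

# Assumption: Python's re module is only a stand-in for the Plan 9 library.
# The patterns below use constructs that mean the same thing in both.

TEXT = """gray cat
grey dog"""  # two lines, so ^ and $ can be seen working per line


def first_match(pattern, text=TEXT):
    """Return the first substring of text matched by pattern, or None."""
    # re.MULTILINE makes ^ and $ match at the start and end of each line.
    m = re.search(pattern, text, re.MULTILINE)
    return m.group(0) if m else None


print(first_match("gr[ae]y"))    # charclass: a or e after gr
print(first_match("[b-d]og"))    # range: b through d, inclusive
print(first_match("[^d-z]at"))   # negated class: anything outside d-z
print(first_match("c.t"))        # . stands for any one character
print(first_match("cat$"))       # $ anchors at the end of a line
print(first_match("^grey"))      # ^ anchors at the beginning of a line
print(first_match("cat.grey"))   # no match: . does not match newline
```
.EE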
.PP
The
.B REP
operators match zero or more
.RB ( * ),
one or more
.RB ( + ),
zero or one
.RB ( ? ),
instances respectively of the preceding regular expression
.BR e2 .
.PP
A concatenated regular expression,
.BR "e1\|e2" ,
matches a match to
.B e1
followed by a match to
.BR e2 .
.PP
An alternative regular expression,
.BR "e0\||\|e1" ,
matches either a match to
.B e0
or a match to
.BR e1 .
.PP
A match to any part of a regular expression
extends as far as possible without preventing
a match to the remainder of the regular expression.
.SH "SEE ALSO"
.IR regexp9 (3)