
Regular Expression Syntax Summary: Unix-like Systems, UltraEdit, MS VC++ 6.0, and VS.NET

August 16, 2013

 

 
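Regular expressions are a powerful text pattern-matching language and are very widely used. Besides the standard regular expressions used on Unix-like systems, you will also meet them in UltraEdit, the MS VC++ 6.0 editor, the VS.NET editor, and other tools. Their syntaxes differ, however, so the syntax of each of these flavors is listed below for reference when needed.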
 
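1. Standard Regular Expressions
"Standard" regular expressions here means the regular expressions used on Unix-like systems. Their syntax is as follows: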
Regular Expressions (Unix Syntax):
 

Symbol        Function
\             Indicates the next character has a special meaning. "n" on its own matches the character "n". "\n" matches a linefeed or newline character. See examples below (\d, \f, \n, etc.).
^             Matches/anchors the beginning of line.
$             Matches/anchors the end of line.
*             Matches the preceding character zero or more times.
+             Matches the preceding character one or more times. Does not match repeated newlines.
.             Matches any single character except a newline character. Does not match repeated newlines.
(expression)  Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is \x, for x in the range 1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would replace it with "folks hello".
[xyz]         A character set. Matches any one of the characters between the brackets.
[^xyz]        A negative character set. Matches any character NOT between the brackets.
\d            Matches a digit character. Equivalent to [0-9].
\D            Matches a nondigit character. Equivalent to [^0-9].
\f            Matches a form-feed character.
\n            Matches a linefeed character.
\r            Matches a carriage return character.
\s            Matches any whitespace including space, tab, form-feed, etc., but not newline.
\S            Matches any non-whitespace character but not newline.
\t            Matches a tab character.
\v            Matches a vertical tab character.
\w            Matches any word character including underscore.
\W            Matches any nonword character.
\p            Matches CR/LF (same as \r\n) to match a DOS line terminator.
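As a rough illustration, the tagged-expression example and the digit shorthand from the table can be tried in Python's `re` module, which uses a closely related but not identical syntax. The function names `swap_words` and `find_numbers` are made up for this sketch. Python has no `\p` shorthand, and its `\s` also matches newlines, so not every row of the table carries over unchanged.

```python
import re

def swap_words(text):
    # (h.*o) and (f.*s) are tagged expressions 1 and 2;
    # the replacement \2 \1 writes them back in swapped order.
    return re.sub(r"(h.*o) (f.*s)", r"\2 \1", text)

def find_numbers(text):
    # \d+ matches one or more digit characters, equivalent to [0-9]+.
    return re.findall(r"\d+", text)

print(swap_words("hello folks"))         # folks hello
print(find_numbers("rooms 12 and 305"))  # ['12', '305']
```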
 
2. UltraEdit-Style Regular Expressions
Regular Expressions (UltraEdit Syntax):
 

Symbol          Function
%               Matches the start of line. Indicates the search string must be at the beginning of a line but does not include any line terminator characters in the resulting string selected.
$               Matches the end of line. Indicates the search string must be at the end of line but does not include any line terminator characters in the resulting string selected.
?               Matches any single character except newline.
*               Matches any number of occurrences of any character except newline.
+               Matches one or more of the preceding character/expression. At least one occurrence of the character must be found. Does not match repeated newlines.
++              Matches the preceding character/expression zero or more times. Does not match repeated newlines.
^b              Matches a page break.
^p              Matches a newline (CR/LF) (paragraph) (DOS files).
^r              Matches a newline (CR only) (paragraph) (Mac files).
^n              Matches a newline (LF only) (paragraph) (Unix files).
^t              Matches a tab character.
[ ]             Matches any single character or range in the brackets.
^{A^}^{B^}      Matches expression A OR B.
^               Overrides the following regular expression character.
^(sub-regex^)   Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is ^x, for x in the range 1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would replace it with "folks hello".
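To make the differences from the Unix syntax concrete, the sketch below translates a small subset of the UltraEdit symbols above into a Python `re` pattern. The helpers `ultraedit_to_python` and `ultraedit_replacement` are hypothetical names invented for this example, not part of UltraEdit or Python. Bracket sets and the `^{A^}^{B^}` alternation are deliberately left out and raise an error.

```python
import re

# UltraEdit symbol -> Python equivalent (subset only; longer symbols first).
_TOKENS = [
    ("^(", "("), ("^)", ")"),        # tagged expression
    ("^p", r"\r\n"), ("^n", r"\n"),  # DOS and Unix newlines
    ("^r", r"\r"), ("^t", r"\t"), ("^b", r"\f"),
    ("++", "*"),                     # zero or more of the preceding item
    ("+", "+"),                      # one or more of the preceding item
    ("%", "^"), ("$", "$"),          # line anchors
    ("?", "."),                      # any single character
    ("*", ".*"),                     # any run of characters
]

def ultraedit_to_python(pattern):
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith(("^{", "["), i):
            raise NotImplementedError("sets and alternation are outside this sketch")
        for ue, py in _TOKENS:
            if pattern.startswith(ue, i):
                out.append(py)
                i += len(ue)
                break
        else:
            if pattern[i] == "^" and i + 1 < len(pattern):
                # A lone ^ overrides (escapes) the following character.
                out.append(re.escape(pattern[i + 1]))
                i += 2
            else:
                out.append(re.escape(pattern[i]))
                i += 1
    return "".join(out)

def ultraedit_replacement(repl):
    # ^1..^9 in UltraEdit correspond to \1..\9 in Python.
    return re.sub(r"\^([1-9])", r"\\\1", repl)

pattern = ultraedit_to_python("^(h*o^) ^(f*s^)")
print(re.sub(pattern, ultraedit_replacement("^2 ^1"), "hello folks"))  # folks hello
```

The token list is ordered so that two-character symbols such as `++` and `^(` are tried before their one-character prefixes.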
 
 
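3. MS VC++ 6.0 Editor-Style Regular Expressions
When editing code in MS VC++ 6.0, we often use Find/Replace on the source. Checking the "Regular expression" option there makes the full power of regular expressions available for both searching and replacing. The syntax rules that apply are listed below: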

Regular Expression  Description
.                   (Period.) Any single character.
[ ]                 Any one of the characters contained in the brackets, or any of an ASCII range of characters separated by a hyphen (-). For example, b[aeiou]d matches bad, bed, bid, bod, and bud, and r[eo]+d matches red, rod, reed, and rood, but not reod or roed. x[0-9] matches x0, x1, x2, and so on. If the first character in the brackets is a caret (^), then the regular expression matches any characters except those in the brackets.
^                   The beginning of a line.
$                   The end of a line.
\( \)               Indicates a tagged expression to retain for replacement purposes. If the expression in the Find What text box is \(lpsz\)BigPointer, and the expression in the Replace With box is \1NewPointer, all selected occurrences of lpszBigPointer are replaced with lpszNewPointer. Each occurrence of a tagged expression is numbered according to its order in the Find What text box, and its replacement expression is \n, where 1 corresponds to the first tagged expression, 2 to the second, and so on. You can have up to nine tagged expressions.
\~                  No match if the following character or characters occur. For example, b\~a+d matches bbd, bcd, bdd, and so on, but not bad. You can use this expression to prefix a group of characters you want to exclude, which is useful for excluding matches of particular words. For example, foo\~\(lish\) matches "foo" in "food" and "afoot" but not in "foolish."
\{c\!c\}            Any one of the characters separated by the alternation symbol (\!). For example, \{j\!u\}+fruit finds jfruit, jjfruit, ufruit, ujfruit, uufruit, and so on.
*                   None or more of the preceding characters or expressions. For example, ba*c matches bc, bac, baac, baaac, and so on.
+                   At least one or more of the preceding characters or expressions. For example, ba+c matches bac, baac, baaac, but not bc.
\{\}                Any sequence of characters between the escaped braces. For example, \{ju\}+fruit finds jufruit, jujufruit, jujujufruit, and so on. Note that it will not find jfruit, ufruit, or ujfruit, because the sequence ju is not in any of those strings.
[^]                 Any character except those following the caret (^) character in the brackets, or any of an ASCII range of characters separated by a hyphen (-). For example, x[^0-9] matches xa, xb, xc, and so on, but not x0, x1, x2, and so on.
\:a                 Any single alphanumeric character [a-zA-Z0-9].
\:b                 Any white-space character. The \:b finds tabs and spaces. There is no alternate syntax to express \:b.
\:c                 Any single alphabetic character [a-zA-Z].
\:d                 Any decimal digit [0-9].
\:n                 Any unsigned number \{[0-9]+\.[0-9]*\![0-9]*\.[0-9]+\![0-9]+\}. For example, \:n should match 123, .45, and 123.45.
\:z                 Any unsigned decimal integer [0-9]+.
\:h                 Any hexadecimal number [0-9a-fA-F]+.
\:i                 Any C/C++ identifier [a-zA-Z_$][a-zA-Z0-9_$]+.
\:w                 Any alphabetic string [a-zA-Z]+. The string need not be bounded by white space or appear at the beginning or the end of a line.
\:q                 Any quoted string \{"[^"]*"\!'[^']*'\}.
\                   Removes the pattern match characteristic in the Find What text box from the special characters listed above. For example, 100$ matches 100 at the end of a line, but 100\$ matches the character string 100$ anywhere on a line.
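The `\:x` shorthands above are abbreviations for longer expressions, so they can be mimicked in Python by plain substitution. The sketch below is only an approximation for experimenting outside the editor: `VC6_SHORTHAND` and `expand_vc6_shorthand` are names invented here, the expansions are copied from the table (with `\{...\!...\}` written as Python's `(?:...|...)`), and treating `\~` as Python's negative lookahead `(?!...)` is an assumption that fits the "foolish" example rather than a documented equivalence.

```python
import re

# Python spellings of the \:x shorthands, following the table's definitions.
VC6_SHORTHAND = {
    "a": r"[a-zA-Z0-9]",
    "b": r"[ \t]",
    "c": r"[a-zA-Z]",
    "d": r"[0-9]",
    "n": r"(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+)",
    "z": r"[0-9]+",
    "h": r"[0-9a-fA-F]+",
    "i": r"[a-zA-Z_$][a-zA-Z0-9_$]+",
    "w": r"[a-zA-Z]+",
    "q": r"""(?:"[^"]*"|'[^']*')""",
}

def expand_vc6_shorthand(pattern):
    # Replace each \:x with its expansion; the rest of the pattern is untouched.
    return re.sub(r"\\:([a-z])", lambda m: VC6_SHORTHAND[m.group(1)], pattern)

# \:n should match 123, .45, and 123.45.
number = expand_vc6_shorthand(r"\:n")
print([bool(re.fullmatch(number, s)) for s in ("123", ".45", "123.45")])

# \(lpsz\)BigPointer with replacement \1NewPointer, in Python notation.
print(re.sub(r"(lpsz)BigPointer", r"\1NewPointer", "lpszBigPointer"))

# foo\~\(lish\) approximated with a negative lookahead.
print(bool(re.search(r"foo(?!lish)", "food")), bool(re.search(r"foo(?!lish)", "foolish")))
```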
 
 
 
 
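4. VS.NET 2005 Editor-Style Regular Expressions
The regular expressions used by the VS.NET 2005 editor build on those of the MS VC++ 6.0 editor and add further constructs, such as minimal (non-greedy) matching: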

Expression               Syntax  Description
Any character            .       Matches any one character except a line break.
Maximal - zero or more   *       Matches zero or more occurrences of the preceding expression.
Maximal - one or more    +       Matches at least one occurrence of the preceding expression.
Minimal - zero or more   @       Matches zero or more occurrences of the preceding expression, matching as few characters as possible.
Minimal - one or more    #       Matches one or more occurrences of the preceding expression, matching as few characters as possible.
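The difference between maximal and minimal matching is easiest to see side by side. Python does not use `@` and `#`; its lazy quantifiers `*?` and `+?` play the same role, so the sketch below uses those as stand-ins. The function name `first_tag` is invented for this example.

```python
import re

def first_tag(text, minimal):
    # Maximal: <.*> grabs as much as possible.
    # Minimal: <.*?> stops at the first closing bracket.
    pattern = r"<.*?>" if minimal else r"<.*>"
    match = re.search(pattern, text)
    return match.group(0) if match else None

print(first_tag("<b>bold</b>", minimal=False))  # <b>bold</b>
print(first_tag("<b>bold</b>", minimal=True))   # <b>
print(re.match(r"a+?", "aaa").group(0))         # a
```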
