Picking up where the previous post left off, this part looks at how to extract substrings from match results.
Example 5: Extracting marked sub-expressions from each match

    #define _SCL_SECURE_NO_WARNINGS // suppress Visual Studio compiler warnings
    #include <iostream>
    #include <string>
    #include <boost/xpressive/xpressive.hpp>

    using namespace boost::xpressive;

    int main()
    {
        std::string str( "Eric: 4:40, Karl: 3:35, Francesca: 2:32" );

        // find a race time
        sregex time = sregex::compile( "(\\d):(\\d\\d)" );

        // for each match, the token iterator should first take the value of
        // the first marked sub-expression followed by the value of the second
        // marked sub-expression
        int const subs[] = { 1, 2 };
        sregex_token_iterator cur( str.begin(), str.end(), time, subs );
        sregex_token_iterator end;
        for( ; cur != end; ++cur )
        {
            std::cout << *cur << '\n';
        }
        /* result:
           4
           40
           3
           35
           2
           32
        */

        // An alternative that uses sregex_iterator, as in Example 4:
        // index the match_results directly to reach each sub-expression
        sregex_iterator curI( str.begin(), str.end(), time );
        sregex_iterator endI;
        for( ; curI != endI; ++curI )
        {
            std::cout << (*curI)[1] << ":" << (*curI)[2] << '\n';
        }
        /* result:
           4:40
           3:35
           2:32
        */
        return 0;
    }
Example 6: Special uses of token_iterator

    #define _SCL_SECURE_NO_WARNINGS // suppress Visual Studio compiler warnings
    #include <iostream>
    #include <string>
    #include <boost/xpressive/xpressive.hpp>

    using namespace boost::xpressive;

    int main()
    {
        std::string str( "Now <bold>is the time <i>for all good men</i> to come to the aid of their</bold> country." );

        // find an HTML tag
        // static-syntax alternative (without a capture group):
        // sregex html = '<' >> optional('/') >> +_w >> '>';
        sregex html = sregex::compile("</?(\\w*)>");

        // -1 is a special sub-match index: it selects the parts of the
        // input that did NOT match, i.e. the text between the matches
        sregex_token_iterator cur( str.begin(), str.end(), html, -1 );
        sregex_token_iterator end;
        for( ; cur != end; ++cur )
        {
            std::cout << '{' << *cur << '}';
        }
        std::cout << '\n';
        // result: {Now }{is the time }{for all good men}{ to come to the aid of their}{ country.}

        // 0 selects the whole of each match
        sregex_token_iterator curI( str.begin(), str.end(), html, 0 );
        for( ; curI != end; ++curI )
        {
            std::cout << '{' << *curI << '}';
        }
        std::cout << '\n';
        // result: {<bold>}{<i>}{</i>}{</bold>}

        // an array holding 1 selects the first marked sub-expression
        // inside each match
        const int sub[] = { 1 };
        sregex_token_iterator cur2( str.begin(), str.end(), html, sub );
        for( ; cur2 != end; ++cur2 )
        {
            std::cout << '{' << *cur2 << '}';
        }
        std::cout << '\n';
        // result: {bold}{i}{i}{bold}
        return 0;
    }
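Perl syntax and the corresponding static xpressive expressions

Perl | Static xpressive | Meaning |
---|---|---|
`.` | `_` | any character (assuming Perl's /s modifier). |
`ab` | `a >> b` | sequencing of `a` and `b` sub-expressions. |
`a\|b` | `a \| b` | alternation of `a` and `b` sub-expressions. |
`(a)` | `(s1= a)` | group and capture a back-reference. |
`(?:a)` | `(a)` | group and do not capture a back-reference. |
`\1` | `s1` | a previously captured back-reference. |
`a*` | `*a` | zero or more times, greedy. |
`a+` | `+a` | one or more times, greedy. |
`a?` | `!a` | zero or one time, greedy. |
`a{n,m}` | `repeat<n,m>(a)` | between `n` and `m` times, greedy. |
`a*?` | `-*a` | zero or more times, non-greedy. |
`a+?` | `-+a` | one or more times, non-greedy. |
`a??` | `-!a` | zero or one time, non-greedy. |
`a{n,m}?` | `-repeat<n,m>(a)` | between `n` and `m` times, non-greedy. |
`^` | `bos` | beginning of sequence assertion. |
`$` | `eos` | end of sequence assertion. |
`\b` | `_b` | word boundary assertion. |
`\B` | `~_b` | not word boundary assertion. |
`\n` | `_n` | literal newline. |
`.` | `~_n` | any character except a literal newline (without Perl's /s modifier). |
`\r?\n\|\r` | `_ln` | logical newline. |
`[^\r\n]` | `~_ln` | any single character not a logical newline. |
`\w` | `_w` | a word character, equivalent to `set[alnum \| '_']`. |
`\W` | `~_w` | not a word character, equivalent to `~set[alnum \| '_']`. |
`\d` | `_d` | a digit character. |
`\D` | `~_d` | not a digit character. |
`\s` | `_s` | a space character. |
`\S` | `~_s` | not a space character. |
`[:alnum:]` | `alnum` | an alpha-numeric character. |
`[:alpha:]` | `alpha` | an alphabetic character. |
`[:blank:]` | `blank` | a horizontal white-space character. |
`[:cntrl:]` | `cntrl` | a control character. |
`[:digit:]` | `digit` | a digit character. |
`[:graph:]` | `graph` | a graphable character. |
`[:lower:]` | `lower` | a lower-case character. |
`[:print:]` | `print` | a printing character. |
`[:punct:]` | `punct` | a punctuation character. |
`[:space:]` | `space` | a white-space character. |
`[:upper:]` | `upper` | an upper-case character. |
`[:xdigit:]` | `xdigit` | a hexadecimal digit character. |
`[0-9]` | `range('0','9')` | characters in range `'0'` through `'9'`. |
`[abc]` | `as_xpr('a') \| 'b' \| 'c'` | characters `'a'`, `'b'`, or `'c'`. |
`[abc]` | `(set= 'a','b','c')` | same as above |
`[0-9abc]` | `set[ range('0','9') \| 'a' \| 'b' \| 'c' ]` | characters `'a'`, `'b'`, `'c'` or in range `'0'` through `'9'`. |
`[0-9abc]` | `set[ range('0','9') \| (set= 'a','b','c') ]` | same as above |
`[^abc]` | `~(set= 'a','b','c')` | not characters `'a'`, `'b'`, or `'c'`. |
`(?i:stuff)` | `icase(stuff)` | match stuff disregarding case. |
`(?>stuff)` | `keep(stuff)` | independent sub-expression, match stuff and turn off backtracking. |
`(?=stuff)` | `before(stuff)` | positive look-ahead assertion, match if before stuff but don't include stuff in the match. |
`(?!stuff)` | `~before(stuff)` | negative look-ahead assertion, match if not before stuff. |
`(?<=stuff)` | `after(stuff)` | positive look-behind assertion, match if after stuff but don't include stuff in the match. (stuff must be constant-width.) |
`(?<!stuff)` | `~after(stuff)` | negative look-behind assertion, match if not after stuff. (stuff must be constant-width.) |
`(?P<name>stuff)` | `mark_tag name(n);` ... `(name= stuff)` | Create a named capture. |
`(?P=name)` | `mark_tag name(n);` ... `name` | Refer back to a previously created named capture. |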