
A Detailed Guide to STL string (Part 1)

March 8, 2013 ⁄ General

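0 Preface: The Role of string

C++ is an excellent language, but excellent does not mean perfect. Many people are still reluctant to use C or C++. Why? There are many reasons, and one of them is that text processing in C/C++ is cumbersome and inconvenient. Before I had worked with other languages, I brushed such complaints aside, thinking these people had not grasped the essence of C++ or did not really know it. Now that I have used Perl, PHP, and shell scripts, I have begun to understand why people say text processing in C++ is inconvenient.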

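For example, suppose a file named name.txt stores one record per line in the format: user name, phone number.

Tom 23245332
Jenny 22231231
Heny 22183942
Tom 23245332
...

Now we need to sort by user name and print each distinct name only once.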

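In shell programming you could do it like this:

awk '{print $1}' name.txt | sort | uniq

Simple, isn't it?

With C/C++ it takes more trouble. You would have to:

Open the file and check that it opened; exit on failure.
Declare a two-dimensional character array that is large enough, or an array of character pointers.
Read a line into the character buffer.
Parse the line, find the space, and store the name in the character array.
Close the file.
Write a sort function, or write a comparison function and sort with qsort.
Walk the array looking for duplicates, and delete any you find by copying elements down...
Print the result.

You can implement this procedure in C++ or C. But if someone's main job is processing text like this (for example, collecting and analyzing Apache log statistics), do you think they would enjoy C/C++?

Of course, the STL simplifies this work a great deal. We can use fstream in place of the tedious fopen, fread, and fclose, and vector in place of arrays. Most importantly, string replaces char* arrays, the sort algorithm does the sorting, and the unique function removes duplicates. Sounds pretty good. Take a look at the code below (Example 1):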

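#include <string>
#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
#include <fstream>
using namespace std;

int main()
{
    ifstream in("name.txt");
    string strtmp;
    vector<string> vect;
    // Read one line at a time and keep only the text before the first space.
    while (getline(in, strtmp))
        vect.push_back(strtmp.substr(0, strtmp.find(' ')));
    // Sort so that equal names become adjacent.
    sort(vect.begin(), vect.end());
    // unique compacts the distinct names to the front and returns the new logical end.
    vector<string>::iterator it = unique(vect.begin(), vect.end());
    // Print only the distinct names in [begin, it).
    copy(vect.begin(), it, ostream_iterator<string>(cout, " "));
    return 0;
}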

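Not bad, right? At least it is much simpler than you might expect! (The code does no error handling; it only illustrates the point, so do not copy it as is.)

Of course, for this text format a map would be more extensible than a vector; for example, you could then look up a phone number by name. With a map, however, sort is no longer so easy to use: std::sort requires random-access iterators, which map does not provide, although a map already keeps its keys in sorted order. You can try it with map.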
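Here is one way the map variant might look. It is only a sketch: the helper name load_phone_book is made up for illustration, and it reads from any istream so that it can be fed either a file or an in-memory string.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: reads "name phone" lines into a map keyed by name.
// A repeated name keeps the first phone number seen, because map::insert
// does not overwrite an existing key.
std::map<std::string, std::string> load_phone_book(std::istream& in)
{
    std::map<std::string, std::string> book;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name, phone;
        if (fields >> name >> phone)   // skip lines that lack either field
            book.insert(std::make_pair(name, phone));
    }
    return book;
}
```

Iterating over the returned map visits the names in sorted order with no duplicates, so the separate sort and unique steps disappear, and book.find("Tom") looks up a phone number by name.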

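Here string does more than store characters; it also provides comparison, searching, and so on. By default, sort compares elements with operator< and unique with operator==, the same comparisons that the less and equal_to functors perform. The code above actually uses the following string features:

Storage, in getline()
Searching, in find()
Substring extraction, in substr()
string operator<, called by default in sort()
string operator==, called by default in unique()

In short, string finally fills some of the gaps in C++ text processing. Combined with the other STL containers, it brings C++ much closer to Perl, shell, and PHP for text handling, so mastering string will make your work far more productive.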

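1 Using string

Strictly speaking, string is not a standalone container; it is just a typedef of the basic_string class template, and wstring is its counterpart. In the string header you will find code like this: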

extern "C++"
{
typedef basic_string <char> string;
typedef basic_string <wchar_t> wstring;
} // extern "C++"

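Since this article only explains how to use string, it does not distinguish between string and basic_string unless otherwise noted.

string is essentially a sequence container that holds characters, so besides the usual string operations it also supports the operations of a sequence container. The common string operations include adding, deleting, modifying, searching, comparing, concatenating, input, and output. See the appendix for the full list of functions. Do not be intimidated by the number of functions: many come from the sequence container interface and are not needed day to day.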
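To make the "sequence container" point concrete, here is a small sketch that hands a string's iterators to generic STL algorithms. The helper names count_char and reversed are invented for this illustration.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: counts how often ch occurs, using the generic
// std::count algorithm on the string's iterators.
std::string::difference_type count_char(const std::string& s, char ch)
{
    return std::count(s.begin(), s.end(), ch);
}

// Hypothetical helper: returns a reversed copy, using std::reverse exactly
// as one would on a vector<char>.
std::string reversed(std::string s)
{
    std::reverse(s.begin(), s.end());
    return s;
}
```

For example, count_char("winter wende", 'e') yields 3 and reversed("winter") yields "retniw". Neither helper needs any string-specific member function.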

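If you want the details of every function, consult the basic_string documentation or download an STL programming reference. Here we introduce some commonly used functions through examples.

1.1 Making Full Use of string Operators

string overloads many operators, including +, +=, <, =, ==, !=, [], <<, and >>. These operators make string manipulation very convenient. Look at the following example, tt.cpp (Example 2):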

#include <string>
#include <iostream>
using namespace std;

int main()
{
    string strinfo = "Please input your name:";
    cout << strinfo;
    cin >> strinfo;   // operator>> reads a single whitespace-delimited word
    if (strinfo == "winter")
        cout << "you are winter!" << endl;
    else if (strinfo != "wende")
        cout << "you are not wende!" << endl;
    else if (strinfo < "winter")
        cout << "your name should be ahead of winter" << endl;
    else
        cout << "your name should be after winter" << endl;
    strinfo += " , Welcome to China!";
    cout << strinfo << endl;
    cout << "Your name is :" << endl;
    string strtmp = "How are you? " + strinfo;
    // size() returns string::size_type, so use that type for the index.
    for (string::size_type i = 0; i < strtmp.size(); i++)
        cout << strtmp[i];
    return 0;
}

Here is the program's output:

-bash-2.05b$ make tt
c++ -O -pipe -march=pentiumpro tt.cpp -o tt
-bash-2.05b$ ./tt

Please input your name:Hero
you are not wende!
Hero , Welcome to China!
Your name is :
How are you? Hero , Welcome to China!

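Because of these operators, the STL functors such as less, greater, and equal_to can take string arguments directly, so passing a string as a parameter is no different from passing an int or a float. For example, you can write:

map<string, int> mymap; // uses less<string> by default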
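The same holds for algorithms that accept a functor. The sketch below sorts names in descending order by passing greater<string>; the helper name sorted_descending is an assumption made for this example.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: sorts names in descending order by passing
// std::greater<std::string>, which relies on string's operator>.
std::vector<std::string> sorted_descending(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end(), std::greater<std::string>());
    return names;
}
```

Given Jenny, Tom, and Heny, the result is Tom, Jenny, Heny. Swapping in less<string>, or omitting the third argument, gives the usual ascending order.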

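With operator+ you can chain concatenations directly, for example:

string strinfo="Winter";
string strlast="Hello " + strinfo + "!";
// You can also do this:
string strtest="Hello " + strinfo + " Welcome" + " to China" + " !";

See the pattern? As long as the expression contains a string object you can keep chaining "+", but one thing must hold: one of the first two operands must be a string object. The reason is simple:

The compiler meets a "+" and sees that one operand is a string object.
It treats the other operand as a string as well (the library provides operator+ overloads that accept a const char* or a char directly).
It performs operator+, which returns a new temporary string object.
If another "+" follows, it repeats from the first step.

Because the expression is evaluated from left to right, if the first two operands are both const char*, the compilation fails: the language defines no addition for two const char* values.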
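The rule about the first two operands can be checked with a short sketch. The function names greet and greet_again are made up here; the commented-out line is the form that fails to compile.

```cpp
#include <string>

// Hypothetical helper: chains operator+ starting from a literal and a string.
std::string greet(const std::string& name)
{
    // OK: "Hello " + name already yields a std::string, so every later +
    // has a std::string as its left operand.
    return "Hello " + name + " Welcome" + " to China" + " !";
}

// Hypothetical helper: shows the workaround when two literals come first.
std::string greet_again(const std::string& name)
{
    // Would NOT compile: both leading operands are string literals, and no
    // operator+ exists for two const char* values.
    // std::string bad = "Hello " + "again, " + name;

    // Workaround: turn the first operand into a std::string explicitly.
    return std::string("Hello ") + "again, " + name;
}
```

Wrapping the first literal in std::string(...) is enough; the remaining operands can stay as plain literals.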

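With these operators available, functions such as assign(), append(), compare(), and at() are rarely needed except for special requirements. at() does offer one extra feature, though: it checks whether the index is valid. Consider:

string str="winter";
// The next line is undefined behavior and may crash the program:
str[100]='!';
// The next line throws an exception (out_of_range):
cout<<str.at(100)<<endl;

Got it? If you want efficiency, access characters with []; if you want robustness, prefer at().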
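A minimal sketch of how the exception from at() might be handled follows. The helper name char_at_or and its fallback parameter are assumptions introduced for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the character at index i, or fallback when
// i is out of range. at() performs the bounds check and throws
// std::out_of_range; operator[] would not check at all.
char char_at_or(const std::string& s, std::string::size_type i, char fallback)
{
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

With str = "winter", char_at_or(str, 1, '?') returns 'i', while char_at_or(str, 100, '?') returns '?' instead of crashing.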

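1.2 The Dazzling Array of string find Functions

Because searching is one of the most frequently used operations, string offers a rich set of search functions:

Function	Description
find	Search forward
rfind	Search backward
find_first_of	Find the first character that matches any character in the given set
find_first_not_of	Find the first character that matches none of the characters in the given set
find_last_of	Find the last character that matches any character in the given set
find_last_not_of	Find the last character that matches none of the characters in the given set

Each of these functions is overloaded four times. The overloads of find_first_of are shown below; the other functions take the same parameters (the backward-searching ones default pos to npos instead of 0), so there are 24 functions in total: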

size_type find_first_of(const basic_string& s, size_type pos = 0)
size_type find_first_of(const charT* s, size_type pos, size_type n)
size_type find_first_of(const charT* s, size_type pos = 0)
size_type find_first_of(charT c, size_type pos = 0)
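The four overloads can be tried side by side. This is only a sketch; the helper name first_of_all_overloads and the character sets it uses are chosen for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: calls each of the four find_first_of overloads on
// text and returns the four results in the order listed above.
std::vector<std::string::size_type> first_of_all_overloads(const std::string& text)
{
    std::vector<std::string::size_type> pos;
    const std::string set("abcde");

    pos.push_back(text.find_first_of(set));            // string set, pos defaults to 0
    pos.push_back(text.find_first_of("rtXYZ", 0, 2));  // n = 2: only "rt" is used
    pos.push_back(text.find_first_of("abcde", 2));     // C string set, start at index 2
    pos.push_back(text.find_first_of('t'));            // single character
    return pos;
}
```

For the input "Heartbeat" the four results are 1, 3, 2, and 4. The second call shows what the n parameter is for: it limits the set to the first n characters of the C string.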

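Every search function returns a size_type. The return value is normally the position of the match; if nothing is found, it is string::npos. Take special care here: any value compared with string::npos should be of type string::size_type, not int or unsigned int. string::npos is in fact -1 converted to size_type, that is, the largest value size_type can hold. Look at the header: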

template <class _CharT, class _Traits, class _Alloc>
const basic_string<_CharT,_Traits,_Alloc>::size_type
basic_string<_CharT,_Traits,_Alloc>::npos
    = (basic_string<_CharT,_Traits,_Alloc>::size_type) -1;
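The sketch below keeps search results in size_type before comparing them with npos. The helper names contains and last_pos are invented here. On platforms where size_type is wider than unsigned int, storing the result in an unsigned int would truncate npos, and a later comparison with npos would then fail, which is why the type matters.

```cpp
#include <string>

// Hypothetical helper: true when needle occurs somewhere in text.
bool contains(const std::string& text, const std::string& needle)
{
    // Keep the result in size_type so the comparison with npos is exact.
    std::string::size_type pos = text.find(needle);
    return pos != std::string::npos;
}

// Hypothetical helper: position of the last occurrence of needle, or npos.
std::string::size_type last_pos(const std::string& text, const std::string& needle)
{
    return text.rfind(needle);
}
```

For example, contains("Hello World", "World") is true, contains("Hello World", "world") is false because the search is case-sensitive, and last_pos("a-b-c", "-") is 3.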

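find and rfind are fairly easy to understand: one searches forward, the other backward, and the trailing pos parameter sets the position where the search starts. find_first_of and find_last_of are less obvious.

find_first_of takes a set of characters and returns the first position in the string at which any character from that set occurs. An example may make this clearer.

Suppose we need to strip all non-English-letter characters from the beginning and end of a line. Here is how to do it with string: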

#include <string>
#include <iostream>
using namespace std;

int main()
{
    string strinfo = " //*---Hello Word!......------";
    string strset = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    // Store positions in string::size_type so the npos comparison is exact.
    string::size_type first = strinfo.find_first_of(strset);
    if (first == string::npos)
    {
        cout << "not find any characters" << endl;
        return -1;
    }
    string::size_type last = strinfo.find_last_of(strset);
    if (last == string::npos)
    {
        cout << "not find any characters" << endl;
        return -1;
    }
    // Extract everything from the first letter through the last letter.
    cout << strinfo.substr(first, last - first + 1) << endl;
    return 0;
}

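Here the full set of upper- and lower-case English letters serves as the character set to search for. We find the position of the first letter, then the position of the last letter, and use substr to extract the part in between for output. The result is: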

Hello Word

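The symbols at the front and back are gone. The same technique can locate delimiters and split a string into several parts, much like awk in the shell. It is especially handy when there are several delimiters, because they can all be specified at once. Suppose we have this input:

张三|3456123, 湖南
李四,4564234| 湖北
王小二, 4433253|北京
...

We need to split each line into fields using "|" and "," as delimiters while also filtering out spaces. Try it as a homework exercise, and keep the code concise.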
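If you want something to compare your attempt against, here is one possible approach, not the only one. The helper name split_fields and the default delimiter set are assumptions; it treats the space as a delimiter too, which is what filters the spaces out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits line on any character in delims and drops
// empty fields. Spaces are listed as delimiters, so they are filtered too.
std::vector<std::string> split_fields(const std::string& line,
                                      const std::string& delims = "|, ")
{
    std::vector<std::string> fields;
    // Skip leading delimiters to find the start of the first field.
    std::string::size_type start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        // The field ends at the next delimiter, or at the end of the line.
        std::string::size_type end = line.find_first_of(delims, start);
        // substr clamps the length, so end == npos takes the rest of the line.
        fields.push_back(line.substr(start, end - start));
        start = line.find_first_not_of(delims, end);
    }
    return fields;
}
```

Calling split_fields("Li|3456123, Hunan") returns the three fields Li, 3456123, and Hunan. The loop alternates find_first_not_of (skip delimiters) with find_first_of (find the end of the field), so runs of mixed delimiters such as ", " collapse into a single separator.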
Some examples:
 

// main.cpp
// compile with: /EHsc
//
// Functions:
//
// string::find_first_of() - find the first instance in the
//         controlled string of any of the elements specified by the
//         parameters. The search begins at an optionally-supplied
//         position in the controlled string.

#include <string>
#include <iostream>
using namespace std ;

int main()
{
    string str1("Heartbeat");
    string str2("abcde");
    size_t iPos = 0;
    cout << "The string to search is '"
         << str1.c_str() << "'"
         << endl;
    // find the first instance in str1 of any characters in str2
    iPos = str1.find_first_of (str2, 0);  // 0 is default position
    cout << "Element in '" << str2.c_str() << "' found at position "
         << iPos << endl;
    // start looking in the third position...
    iPos = str1.find_first_of (str2, 2);
    cout << "Element in '" << str2.c_str() << "' found at position "
         << iPos << endl;
    // use an array of the element type as the set of elements to
    // search for; look for anything after the fourth position
    char achVowels[] = {'a', 'e', 'i', 'o', 'u'};
    iPos = str1.find_first_of (achVowels, 4, sizeof(achVowels));
    cout << "Element in '";
    for (size_t i = 0; i < sizeof (achVowels); i++)
        cout << achVowels[i];
        cout