Common Ways to Read a File, Compared


May 2, 2011

Unformatted input: reading the input stream directly, byte by byte.

Every test uses the same 24 MB English text file.

C, fread

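#include <stdio.h>

int main(void) {
    char c[1];
    size_t num;  /* fread returns size_t, not int */
    FILE *fp = fopen("http://www.cnblogs.com/big.log", "rb");
    if (fp == NULL) {  /* fopen failed; nothing to read */
        perror("fopen");
        return 1;
    }
    while (1) {
        /* Read one char per call: the first 1 is the element size in bytes,
           the second 1 is the number of elements to read. */
        num = fread(c, 1, 1, fp);
        /* printf("%c", c[0]); */
        if (num == 0)
            break;
    }
    fclose(fp);
    return 0;
}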

Run time

real    0m1.349s
user    0m1.028s
sys    0m0.316s

Change the buffer to char c[1024], so that each call reads one element of 1024 bytes. This cuts the number of fread calls.

num = fread(c, 1024, 1, fp);

real    0m0.041s
user    0m0.004s
sys    0m0.036s

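Note that the return value is the number of elements successfully read, not the number of bytes.

If fewer than 1024 bytes remain in the file, the fread above returns 0, even though some bytes were read.

For the Huffman program, the element size should be one byte, with many elements read per call.

fread(c, 1, 1024, fp): this way the return value equals the number of bytes read.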

NAME
    fread, fwrite - binary stream input/output

SYNOPSIS
    #include <stdio.h>

    size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
    size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);

DESCRIPTION
    The function fread() reads nmemb elements of data, each size bytes
    long, from the stream pointed to by stream, storing them at the
    location given by ptr.
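
To make the difference between the two argument orders concrete, here is a minimal sketch. The helper names count_bytes and count_full_blocks are my own, chosen for illustration; only fopen, fread and fclose come from the standard library. The first helper uses size = 1 and nmemb = 1024, so the return value is a byte count. The second uses size = 1024 and nmemb = 1, so it only sees complete 1024-byte elements.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a file by reading 1-byte
   elements, up to 1024 per call. Returns -1 if the file cannot be opened. */
long count_bytes(const char *path) {
    char buf[1024];
    long total = 0;
    size_t n;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    /* size = 1, nmemb = 1024: fread returns the number of bytes read,
       so a short final chunk is still counted. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)n;
    fclose(fp);
    return total;
}

/* Hypothetical helper: count only the complete 1024-byte elements.
   Returns -1 if the file cannot be opened. */
long count_full_blocks(const char *path) {
    char buf[1024];
    long blocks = 0;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    /* size = 1024, nmemb = 1: fread returns 1 for a full element and 0
       once fewer than 1024 bytes remain. */
    while (fread(buf, sizeof buf, 1, fp) == 1)
        blocks++;
    fclose(fp);
    return blocks;
}
```

For a 2500-byte file, count_bytes returns 2500 while count_full_blocks returns 2, because the trailing 452 bytes do not fill a whole element.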

Python, read()

f = open('http://www.cnblogs.com/big.log', 'rb')
while True:
    c = f.read(1)
    if not c:  # an empty result means end of file
        break
    # print(c)
f.close()

Run time

real 0m15.013s
user 0m13.505s

Python is slow here because each function call costs far more than it does in C.

C++ istreambuf_iterator

#include <iostream>
#include <iterator>
#include <string>
#include <fstream>
using namespace std;

int main() {
    ifstream input_file("http://www.cnblogs.com/big.log", ios::binary);
    char c;
    // istreambuf_iterator reads directly from the input stream buffer
    istreambuf_iterator<char> eos;              // end-of-range iterator
    istreambuf_iterator<char> iit(input_file);
    while (iit != eos)
        c = *iit++;
    return 0;
}

real    0m3.414s
user    0m2.528s
sys    0m0.884s

C++ istream_iterator

#include <iostream>
#include <iterator>
#include <string>
#include <fstream>
using namespace std;

int main() {
    ifstream input_file("http://www.cnblogs.com/big.log", ios::binary);
    input_file.unsetf(ios::skipws);  // keep whitespace characters instead of skipping them
    char c;
    // istream_iterator reads each character through operator>>
    istream_iterator<char> eos;              // end-of-range iterator
    istream_iterator<char> iit(input_file);
    while (iit != eos) {
        c = *iit++;
        //cout << c;
    }
    return 0;
}

real    0m2.433s
user    0m1.732s
sys    0m0.704s

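All of the results above were obtained with gcc 4.2.4. Oddly, Item 29 of Effective STL says that istream_iterator goes through operator>> while istreambuf_iterator reads directly from the stream buffer, so istreambuf_iterator should be much faster.

In my experiment, however, istreambuf_iterator turned out to be the slower of the two.

If file I/O speed matters, it is better to use C directly.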

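//C++ cin.get(ch): the return value is a reference to the istream object

#include <fstream>
using namespace std;

int main() {
    ifstream input_file("http://www.cnblogs.com/big.log", ios::binary);
    char c[1024];
    //input_file.get(c, 1024);
    while (input_file.get(c[0]))
        ;  // empty body: each byte is read and discarded
    //cout << c[0];
    return 0;
}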

real    0m2.300s
user    0m0.472s
sys    0m1.700s

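//C++ cin.get(): the return value is an int; EOF (-1) signals the end of input

int ch;  // int rather than char, so EOF can be told apart from every valid byte
while ((ch = input_file.get()) != EOF)
    cout << static_cast<char>(ch);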

real    0m0.885s
user    0m0.524s
sys    0m0.364s
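
The reason the return type is int is easier to see in C, where fgetc follows the same convention. The sketch below swaps in fgetc as the C counterpart of istream::get(); it is not the code that was timed above. The helper name count_with_fgetc is hypothetical. Because the result stays in an int until it has been compared with EOF, a data byte of 0xFF is not mistaken for the end of the file.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a file one fgetc call at a time.
   Returns -1 if the file cannot be opened. */
long count_with_fgetc(const char *path) {
    long total = 0;
    int ch;  /* int, not char: it must be able to hold EOF as well as any byte */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    while ((ch = fgetc(fp)) != EOF)
        total++;
    fclose(fp);
    return total;
}
```

If ch were declared as char, the comparison with EOF could stop early at a 0xFF byte where char is signed, or never succeed where char is unsigned.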

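//Note: get(char *, streamsize, delimiter = '\n') stores at most streamsize - 1 characters and then appends '\0'. With a count of 2, the second byte written is '\0' and input_file.gcount() reports 1.

So to read one character at a time with this function, pass 2 rather than 1.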

input_file.get(c,2, EOF);

//POSIX system call read()

#include <fcntl.h>   // open
#include <unistd.h>  // read, close
#include <string>

int main() {
    char c[1024];
    ssize_t num;
    std::string s = "http://www.cnblogs.com/big.log";
    //std::string s = "http://www.cnblogs.com/test.log";
    int fd = open(s.c_str(), O_RDONLY);
    if (fd < 0)
        return 1;  // open failed
    while (true) {
        num = read(fd, c, 1);  // one system call per byte
        //printf("%c", c[0]);
        if (num <= 0)  // 0 means end of file, -1 means error
            break;
    }
    close(fd);
    return 0;
}

real    0m43.560s
user    0m4.908s
sys    0m38.594s
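
Almost all of that time is sys time, which suggests the cost is the system call made for every single byte: unlike fread, read() has no user-space buffer in front of it. A minimal sketch of the usual remedy follows, assuming a 4096-byte buffer (an arbitrary size I picked, not something I measured) and a hypothetical helper name count_bytes_read. I have not timed this version.

```c
#include <fcntl.h>   /* open */
#include <unistd.h>  /* read, close, ssize_t */

/* Hypothetical helper: count the bytes in a file with read(), asking for
   up to 4096 bytes per system call instead of 1.
   Returns -1 if the file cannot be opened or a read fails. */
long count_bytes_read(const char *path) {
    char buf[4096];  /* assumed buffer size, chosen for illustration */
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* read() returns the number of bytes actually read, 0 at end of file,
       and -1 on error. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += (long)n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

With this loop a 24 MB file needs roughly six thousand read() calls rather than about 24 million.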


Author: hjrnh

Posted by hjrnh in the General (综合) category; last updated May 2, 2011.