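The original problem is as follows:
Write a program that parses URLs.
A URL is a string of the following form:
schema + "://" + authority + host + port + fullpath
Here authority, port and fullpath are all optional: a URL may contain all of them, some of them, or none of them.
schema is a string of English letters (upper or lower case). Typical examples: http, ftp, mms.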
authority has one of the following forms:
user + ":" + passwd + "@"
or user + "@"
or ":" + "@"
or ":" + passwd + "@"
or "@"
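The task wants user and passwd as separate columns, so the authority text has to be split once more at the ":". Below is a minimal sketch of that step. The helper name split_authority and its buffer-based signature are my own assumptions, not part of the original code. It expects the text before "@" (with the "@" already removed) and fills in "default" for any empty part.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post).
 * auth is the authority text WITHOUT the trailing '@', e.g. "user:passwd",
 * "user", ":", ":passwd" or "". Empty parts are replaced by "default".
 * user and passwd must each point to a buffer of at least size bytes. */
void split_authority(const char *auth, char *user, char *passwd, size_t size)
{
    const char *colon = strchr(auth, ':');
    size_t ulen = colon ? (size_t)(colon - auth) : strlen(auth);
    const char *pw = colon ? colon + 1 : "";

    if (ulen == 0)
        snprintf(user, size, "%s", "default");
    else
        snprintf(user, size, "%.*s", (int)ulen, auth);

    snprintf(passwd, size, "%s", pw[0] != '\0' ? pw : "default");
}
```

For the sample data, split_authority("sss", user, passwd, sizeof user) would leave "sss" in user and "default" in passwd. The split happens at the first ":", so a password that itself contains ":" is kept whole.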
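host is a string of English letters (upper or lower case) and the ten Arabic digits.
port has the form:
":" + a string of digits
fullpath has the same format as a full path on Unix, e.g. /user/1.mp3,2.mp3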
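Since host contains only letters and digits, whatever follows it is either a port (":" plus digits), the fullpath, or nothing. The port rule can be checked with a small helper like the one below. This is only a sketch: the name parse_port and the choice of strtol are my assumptions. It reports how many characters the port occupies, so the caller knows where fullpath begins.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post).
 * s points at the text immediately after the host.
 * If s starts with ':' followed by at least one digit, the number is stored
 * in *port and the count of consumed characters (including ':') is returned.
 * Otherwise *port is left untouched and 0 is returned. */
size_t parse_port(const char *s, long *port)
{
    char *end;

    if (s[0] != ':' || !isdigit((unsigned char)s[1]))
        return 0;
    *port = strtol(s + 1, &end, 10);
    return (size_t)(end - s);
}
```

With the text ":78/user/1.mp3,2.mp3" the helper stores 78 and returns 3, so fullpath starts three characters further on. A return value of 0 means the port column should get "default".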
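Given a text file 1.txt in which every line is a URL string, write a program that extracts schema, user, passwd, host, port and fullpath from every URL in the file and writes the results to 2.txt. Each line of 2.txt holds the result for one URL and has 6 columns, namely the values of schema, user, passwd, host, port and fullpath, neatly aligned. If a value is absent, use the string "default" instead.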
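One way to meet the "neatly aligned" requirement is to print every column with a fixed width via printf-style padding. The sketch below assumes all six values are already available as strings. The names or_default and format_row and the column widths are my own arbitrary choices, not something the task prescribes.

```c
#include <stdio.h>

/* Hypothetical helper: a NULL or empty field becomes "default". */
static const char *or_default(const char *s)
{
    return (s != NULL && s[0] != '\0') ? s : "default";
}

/* Hypothetical helper: format one result row into out.
 * "%-12s" left-aligns a value and pads it with spaces to 12 characters.
 * The widths (8, 12, 12, 20, 8) are assumptions; longer values are not
 * truncated, they simply push the following columns to the right. */
int format_row(char *out, size_t size,
               const char *schema, const char *user, const char *passwd,
               const char *host, const char *port, const char *fullpath)
{
    return snprintf(out, size, "%-8s %-12s %-12s %-20s %-8s %s\n",
                    or_default(schema), or_default(user), or_default(passwd),
                    or_default(host), or_default(port), or_default(fullpath));
}
```

The last column is left unpadded because nothing follows it. Passing the port as a string rather than an int makes it easy to print "default" when it is missing.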
Approach:
1. Open 1.txt and read one line.
2. Parse that line.
3. Append the output to 2.txt.
I only implemented part of this, namely step 2. The data in 1.txt is as follows:
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
The code was written with reference to another web page:
http://blog.csdn.net/is2120/article/details/6251412
The code is as follows:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Return a malloc'ed copy of the first len characters of src (NULL on failure). */
static char *copy_part(const char *src, size_t len)
{
    char *p = (char *)malloc(len + 1);   /* +1 for the terminating '\0' */
    if (p != NULL) {
        memcpy(p, src, len);
        p[len] = '\0';
    }
    return p;
}

/* Parse url into its parts. Schema, Authority and Host are malloc'ed
   (the caller frees them); Fullpath points into url itself.
   Returns 0 on success, -1 on error. */
int parse_url(char *url, char **Schema_p, char **Authority_p,
              char **Host_p, int *portp, char **Fullpath_p)
{
    char *sep;
    size_t len;
    int numread = 0;

    /* Schema: http, ftp, mms, ... everything before "://" */
    sep = strstr(url, "://");
    if (sep == NULL)
        return -1;
    *Schema_p = copy_part(url, (size_t)(sep - url));
    /* skip the schema and "://" */
    url = sep + 3;

    /* Authority is optional: present only if '@' comes before any '/' */
    len = strcspn(url, "@/");
    if (url[len] == '@') {
        *Authority_p = copy_part(url, len);
        /* skip the authority and '@' */
        url = url + len + 1;
    } else {
        *Authority_p = copy_part("default", 7);
    }

    /* Host: ends at ':' (port), '/' (path) or the end of the string */
    len = strcspn(url, ":/");
    *Host_p = copy_part(url, len);

    /* Port: optional; 80 is the fallback used here
       (the task itself asks for the string "default") */
    *portp = 80;
    if (url[len] == ':') {
        sscanf(&url[len + 1], "%d%n", portp, &numread);
        /* add one to go PAST the ':' */
        numread++;
    }

    /* Fullpath: a pointer into the rest of url */
    *Fullpath_p = &url[len + numread];

    if (*Schema_p == NULL || *Authority_p == NULL || *Host_p == NULL) {
        free(*Schema_p);
        free(*Authority_p);
        free(*Host_p);
        return -1;
    }
    return 0;
}

/* Parse one URL and print the URL and its parts, one per line. */
static void show(char *url)
{
    char *Schema_p, *Authority_p, *Host_p, *Fullpath_p;
    int port;

    if (parse_url(url, &Schema_p, &Authority_p, &Host_p, &port, &Fullpath_p) != 0) {
        printf("%s\ninvalid url\n", url);
        return;
    }
    printf("%s\n%s\n%s\n%s\n%d\n%s\n", url, Schema_p, Authority_p, Host_p, port, Fullpath_p);
    free(Schema_p);
    free(Authority_p);
    free(Host_p);
}

int main()
{
    char url[256] = "http://sss@ewr1234567890:78/user/1.mp3,2.mp3";
    char url_1[256] = "ftp://sss@ewr1234567890:78/user/1.mp3,2.mp3";

    show(url);
    printf("\n\n");
    show(url_1);
    return 0;
}
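To implement steps 1 and 3 of the approach, the url and url_1 strings defined above only need to be read from 1.txt instead, and the final printf changed to write to 2.txt in the required format. That code is not written out here.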
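For readers who want a starting point anyway, steps 1 and 3 might look roughly like the sketch below. It is not the author's code. The names process_file and url_to_row are hypothetical, and url_to_row is a deliberate placeholder that just copies the line. In a real program it would parse the URL and build the six-column row.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder (assumption): stands in for "parse the URL and format the row".
 * Here it only copies the URL and adds a newline. */
static void url_to_row(const char *url, char *row, size_t size)
{
    snprintf(row, size, "%s\n", url);
}

/* Hypothetical driver for steps 1 and 3: read in_path line by line and
 * append one row per non-empty line to out_path.
 * Returns the number of rows written, or -1 if a file cannot be opened.
 * Lines longer than the buffer would be split; that case is not handled. */
long process_file(const char *in_path, const char *out_path)
{
    char line[1024], row[2048];
    long count = 0;
    FILE *in, *out;

    in = fopen(in_path, "r");
    if (in == NULL)
        return -1;
    out = fopen(out_path, "a");   /* "a" appends, as step 3 describes */
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                         /* skip blank lines */
        url_to_row(line, row, sizeof row);
        fputs(row, out);
        count++;
    }
    fclose(in);
    fclose(out);
    return count;
}
```

A call such as process_file("1.txt", "2.txt") would then handle the whole file. Because the output is opened in append mode, running the program twice adds the rows twice. Opening with "w" instead would overwrite 2.txt on each run.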