A Simple Python Crawler: Scraping Baidu Tieba

September 5, 2014 ⁄ General

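# -*- coding: utf-8 -*-

# Import the required module (Python 2: urllib2 does not exist in Python 3)
import urllib2

# Directory where the downloaded pages are saved; it must already exist
road = 'F:\\recover\\2014.3-2014.6\\python pachong\\download data\\'

# Crawl function: download pages s through e (inclusive) of the thread at url
def spider(s, e, url):
    for i in range(s, e + 1):
        # Zero-pad the page number to five digits, e.g. 00001.html
        file_name = str(i).zfill(5) + '.html'
        print u"Spider is downloading %s ......." % file_name
        # Binary mode saves the response bytes unchanged
        with open(road + file_name, 'wb') as f:
            html = urllib2.urlopen(url + str(i))
            f.write(html.read())

url = raw_input('Enter the Tieba URL to crawl (remove the number after pn): ')
start_page = int(raw_input('Enter the first page to crawl: '))
end_page = int(raw_input('Enter the last page to crawl: '))

spider(start_page, end_page, url)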
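
The script above targets Python 2, where `urllib2` and `raw_input` exist. Below is a minimal sketch of the same loop for Python 3, assuming `urllib.request.urlopen` as the replacement for `urllib2.urlopen`. The names `fetch_page` and `out_dir` and the `fetch` parameter are not in the original post. They are introduced here so the download step can be swapped out and the loop tried without network access. The URL in the demonstration is a placeholder, not a real thread.

```python
import os
import tempfile
from urllib.request import urlopen


def fetch_page(page_url):
    """Return the raw bytes of one page (performs a real HTTP request)."""
    with urlopen(page_url, timeout=10) as response:
        return response.read()


def spider(start, end, url, out_dir, fetch=fetch_page):
    """Save pages start..end (inclusive) of url into out_dir; return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for page in range(start, end + 1):
        # Zero-pad the page number to five digits, as in the original script
        file_name = str(page).zfill(5) + ".html"
        path = os.path.join(out_dir, file_name)
        print("Spider is downloading %s ......." % file_name)
        # Binary mode keeps the downloaded bytes unchanged
        with open(path, "wb") as f:
            f.write(fetch(url + str(page)))
        saved.append(path)
    return saved


if __name__ == "__main__":
    # Offline demonstration with a stand-in fetcher (hypothetical helper),
    # so no request is sent anywhere.
    def fake_fetch(page_url):
        return ("<html>" + page_url + "</html>").encode("utf-8")

    demo_dir = tempfile.mkdtemp()
    paths = spider(1, 3, "http://example.invalid/p/1?pn=", demo_dir, fetch=fake_fetch)
    print([os.path.basename(p) for p in paths])
```

For a real run, pass the thread URL up to and including `pn=` and leave `fetch` at its default. The pages are then written to `out_dir` as `00001.html`, `00002.html`, and so on.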

Author: penwopqcndkylupqd

Posted by penwopqcndkylupqd in the General category; last updated September 5, 2014.
When reposting, please credit the source: 学步园
