
Scraping Batch Download URLs on Linux

October 2, 2013

A video site presents its online playlist as a list of episode links.

Viewing the page source shows the following markup:

<div class="fj1"><span>第1集</span><a href="/eschool/video/autohtml/310/261/0.shtml" target="_blank">1 C++简介</a></div>
<div class="fj1"><span>第2集</span><a href="/eschool/video/autohtml/310/261/1.shtml" target="_blank">2 C++的发展过程</a></div>
<div class="fj1"><span>第3集</span><a href="/eschool/video/autohtml/310/261/2.shtml" target="_blank">3 C与C++的区别</a></div>
<div class="fj1"><span>第4集</span><a href="/eschool/video/autohtml/310/261/3.shtml" target="_blank">4 学习C++之前要先学C吗?</a></div>
<div class="fj1"><span>第5集</span><a href="/eschool/video/autohtml/310/261/4.shtml" target="_blank">5 C++与其他语言的区别</a></div>
<div class="fj1"><span>第6集</span><a href="/eschool/video/autohtml/310/261/5.shtml" target="_blank">6 C++版本及安装问题</a></div>
<div class="fj1"><span>第7集</span><a href="/eschool/video/autohtml/310/261/6.shtml" target="_blank">7 VS2005编译器</a></div>
<div class="fj1"><span>第1集</span><a href="/eschool/video/autohtml/310/281/0.shtml" target="_blank">1 简单的屏幕输出小程序</a></div>
<div class="fj1"><span>第2集</span><a href="/eschool/video/autohtml/310/281/1.shtml" target="_blank">2 输出语句的使用</a></div>
<div class="fj1"><span>第3集</span><a href="/eschool/video/autohtml/310/281/2.shtml" target="_blank">3 std::是什么?</a></div>
<div class="fj1"><span>第4集</span><a href="/eschool/video/autohtml/310/281/3.shtml" target="_blank">4 iostream与iostream.h区别</a></div>
<div class="fj1"><span>第5集</span><a href="/eschool/video/autohtml/310/281/4.shtml" target="_blank">5 重名冲突</a></div>
<div class="fj1"><span>第6集</span><a href="/eschool/video/autohtml/310/281/5.shtml" target="_blank">6 注释</a></div>
<div class="fj1"><span>第1集</span><a href="/eschool/video/autohtml/310/301/0.shtml" target="_blank">1 函数演示</a></div>
<div class="fj1"><span>第2集</span><a href="/eschool/video/autohtml/310/301/1.shtml" target="_blank">2 函数的传参</a></div>
<div class="fj1"><span>第3集</span><a href="/eschool/video/autohtml/310/301/2.shtml" target="_blank">3 函数的返回值、参数与变量.swf</a></div>
<div class="fj1"><span>第4集</span><a href="/eschool/video/autohtml/310/301/3.shtml" target="_blank">4 函数的声明与定义</a></div>
<div class="fj1"><span>第5集</span><a href="/eschool/video/autohtml/310/301/4.shtml" target="_blank">5 局部变量</a></div>
<div class="fj1"><span>第6集</span><a href="/eschool/video/autohtml/310/301/5.shtml" target="_blank">6 全局变量</a></div>
<div class="fj1"><span>第1集</span><a href="/eschool/video/autohtml/310/302/0.shtml" target="_blank">1 C++数据类型</a></div>
<div class="fj1"><span>第2集</span><a href="/eschool/video/autohtml/310/302/1.shtml" target="_blank">2 什么是变量</a></div>
<div class="fj1"><span>第3集</span><a href="/eschool/video/autohtml/310/302/2.shtml" target="_blank">3 变量及数据如何存储在内存上</a></div>
<div class="fj1"><span>第4集</span><a href="/eschool/video/autohtml/310/302/3.shtml" target="_blank">4 布尔型</a></div>
<div class="fj1"><span>第5集</span><a href="/eschool/video/autohtml/310/302/4.shtml" target="_blank">5 字符型</a></div>
<div class="fj1"><span>第6集</span><a href="/eschool/video/autohtml/310/302/5.shtml" target="_blank">6 双字节型</a></div>
<div class="fj1"><span>第7集</span><a href="/eschool/video/autohtml/310/302/6.shtml" target="_blank">7 整型概述</a></div>
<div class="fj1"><span>第8集</span><a href="/eschool/video/autohtml/310/302/7.shtml" target="_blank">8 为什么使用补码</a></div>
<div class="fj1"><span>第9集</span><a href="/eschool/video/autohtml/310/302/8.shtml" target="_blank">9 整型变量的定义</a></div>
<div class="fj1"><span>第10集</span><a href="/eschool/video/autohtml/310/302/9.shtml" target="_blank">10 浮点型变量</a></div>
<div class="fj1"><span>第11集</span><a href="/eschool/video/autohtml/310/302/10.shtml" target="_blank">11 常量</a></div>

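Now we start scraping the URLs. The pipeline below splits the page at every double quote so that each attribute value lands on its own line, then keeps only the lines that start with the episode path (this relies on GNU sed treating \n in the replacement as a newline):

curl -s 'http://www.enet.com.cn/eschool/video/autohtml/310/281/0.shtml' | sed -n 's/"/\n/gp' | grep '^/eschool/video/autohtml/' > down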

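As a result, the file down holds one site-relative path per line, for example /eschool/video/autohtml/310/261/0.shtml.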
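
To see what the sed and grep steps do without touching the network, here is a minimal sketch that runs the same pipeline over two lines copied from the page source above. The function name extract_paths is made up for illustration, and the sketch assumes GNU sed.

```shell
#!/bin/bash
# extract_paths: hypothetical helper. Reads HTML on stdin and prints every
# double-quoted token that starts with /eschool/video/autohtml/.
# Assumes GNU sed, which turns \n in the replacement into a newline.
extract_paths() {
    sed -n 's/"/\n/gp' | grep '^/eschool/video/autohtml/'
}

# Two lines taken from the page source shown earlier.
sample='<div class="fj1"><span>第1集</span><a href="/eschool/video/autohtml/310/261/0.shtml" target="_blank">1 C++简介</a></div>
<div class="fj1"><span>第2集</span><a href="/eschool/video/autohtml/310/261/1.shtml" target="_blank">2 C++的发展过程</a></div>'

printf '%s\n' "$sample" | extract_paths
# prints:
# /eschool/video/autohtml/310/261/0.shtml
# /eschool/video/autohtml/310/261/1.shtml
```

Splitting on quotes is a shortcut rather than real HTML parsing; it works here because every href value on this page is wrapped in double quotes.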

Next, we complete the URLs by prepending the host to each path:

sed 's|^/|http://www.enet.com.cn/|' down > downdown
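
The substitution can be tried on a single path first. This sketch uses | as the sed delimiter so the slashes need no escaping, and anchors the match at the start of the line so only the leading slash is replaced; to_absolute is a made-up name.

```shell
#!/bin/bash
# to_absolute: hypothetical helper. Prefixes each site-relative path on stdin
# with the host used in this article.
to_absolute() {
    sed 's|^/|http://www.enet.com.cn/|'
}

printf '%s\n' /eschool/video/autohtml/310/261/0.shtml | to_absolute
# prints http://www.enet.com.cn/eschool/video/autohtml/310/261/0.shtml
```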

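Finally, we run a shell script that visits each episode page and collects the media URLs it references:

#!/bin/bash
# Read downdown line by line; quoting "$line" avoids word splitting and globbing.
while IFS= read -r line
do
        curl -s "$line" | sed -n 's/"/\n/gp' | grep '^http://images.enet.com.cn/eschool/c++/' >> download.txt
done < downdown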
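
The loop can also be wrapped in a function so it is easy to try against saved pages before hitting the site. This is only a sketch: the FETCH variable and the collect_media name are assumptions introduced here, and GNU sed is assumed as before. The prefix is matched literally with awk's index(), so characters such as . and + in the URL are not treated as regex syntax.

```shell
#!/bin/bash
# FETCH is a hypothetical knob: the command used to retrieve one page.
# It defaults to "curl -s"; set FETCH=cat to run against local files.
FETCH=${FETCH:-curl -s}

# collect_media LIST PREFIX: for each URL (or file path) listed in LIST,
# print every double-quoted token that begins with the literal PREFIX.
collect_media() {
    local list=$1 prefix=$2 page
    while IFS= read -r page; do
        [ -n "$page" ] || continue      # skip blank lines
        # $FETCH is left unquoted on purpose so "curl -s" splits into words.
        $FETCH "$page" | sed -n 's/"/\n/gp' |
            awk -v p="$prefix" 'index($0, p) == 1'
    done < "$list"
}

# Example call (performs network requests when FETCH is curl):
# collect_media downdown 'http://images.enet.com.cn/eschool/c++/' | sort -u > download.txt
```

Piping through sort -u drops duplicate addresses, and writing with > instead of >> keeps a rerun from appending to an older download.txt.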

The collected download URLs end up in download.txt, one per line.

Now you can batch-download them with Xunlei (Thunder)!
