Reposted from: http://blog.csdn.net/zhubenfulovepoem/article/details/7191794
The following is the list of computer vision test datasets and source-code sites from the appendix of the book Computer Vision: Algorithms and Applications. I have tidied it up and added a few annotations.
Computer Vision: Algorithms and Applications
Richard Szeliski
In this final appendix, I summarize additional material useful to students, professors, and researchers. The book's Web site, http://szeliski.org/Book, contains updated pointers to datasets and software; please visit it as well.
C.1 Datasets
One of the keys to testing the reliability of your algorithms is to run them on challenging, representative datasets. Such tests can be even more informative (and of higher quality) when ground truth or other people's results are available for comparison.
Over the years, a large number of datasets have been proposed for testing and evaluating computer vision algorithms. Many of these datasets and software packages are indexed on the Computer Vision Homepage. Some newer sites, such as CVonline (http://homepages.inf.ed.ac.uk/rbf/CVonline ), VisionBib.Com (http://datasets.visionbib.com/ ), and Computer Vision Online (http://computervisiononline.com/ ), carry more up-to-date datasets and software.
Below, I list some of the most widely used datasets, arranged by the chapter they relate to most closely.
Chapter 2: Image formation
CUReT: Columbia-Utrecht Reflectance and Texture Database, http://www1.cs.columbia.edu/CAVE/software/curet/ (Dana, van Ginneken, Nayar et al. 1999).
Middlebury Color Datasets: registered color images taken by different cameras to study how they transform gamuts and colors, http://vision.middlebury.edu/color/data/ (Chakrabarti, Scharstein, and Zickler 2009).
Chapter 3: Image processing
Middlebury test datasets for evaluating MRF (Markov random field) minimization/inference algorithms, http://vision.middlebury.edu/MRF/results/ (Szeliski, Zabih, Scharstein et al. 2008).
Chapter 4: Feature detection and matching
Affine Covariant Features database for evaluating feature detector and descriptor matching quality and repeatability, http://www.robots.ox.ac.uk/~vgg/research/affine/ (Mikolajczyk and Schmid 2005; Mikolajczyk, Tuytelaars, Schmid et al. 2005).
Database of matched image patches for learning and feature descriptor evaluation, http://cvlab.epfl.ch/~brown/patchdata/patchdata.html (Winder and Brown 2007; Hua, Brown, and Winder 2007).
Chapter 5: Segmentation
Berkeley Segmentation Dataset and Benchmark of 1000 images labeled by 30 humans, along with an evaluation, http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/ (Martin, Fowlkes, Tal et al. 2001).
Weizmann segmentation evaluation database of 100 grayscale images with ground truth segmentations, http://www.wisdom.weizmann.ac.il/~vision/Seg EvaluationDB/index.html (Alpert, Galun, Basri et al. 2007).
Chapter 8: Dense motion estimation
The Middlebury optical flow evaluation Web site, http://vision.middlebury.edu/flow/data/ (Baker, Scharstein, Lewis et al. 2009).
The Human-Assisted Motion Annotation database, http://people.csail.mit.edu/celiu/motionAnnotation/ (Liu, Freeman, Adelson et al. 2008).
Chapter 10: Computational photography
High Dynamic Range radiance maps, http://www.debevec.org/Research/HDR/ (Debevec and Malik 1997).
Alpha matting evaluation Web site, http://alphamatting.com/ (Rhemann, Rother, Wang et al. 2009).
Chapter 11: Stereo correspondence
Middlebury Stereo Datasets and Evaluation, http://vision.middlebury.edu/stereo/ (Scharstein and Szeliski 2002).
Stereo Classification and Performance Evaluation of different aggregation costs for stereo matching, http://www.vision.deis.unibo.it/spe/SPEHome.aspx (Tombari, Mattoccia, Di Stefano et al. 2008).
Middlebury Multi-View Stereo Datasets, http://vision.middlebury.edu/mview/data/ (Seitz, Curless, Diebel et al. 2006).
Multi-view and Oxford Colleges building reconstructions, http://www.robots.ox.ac.uk/~vgg/data/data-mview.html .
Multi-View Stereo Datasets, http://cvlab.epfl.ch/data/strechamvs/ (Strecha, Fransens, and Van Gool 2006).
Multi-View Evaluation, http://cvlab.epfl.ch/~strecha/multiview/ (Strecha, von Hansen, Van Gool et al. 2008).
Chapter 12: 3D reconstruction
HumanEva: synchronized video and motion capture dataset for evaluation of articulated human motion, http://vision.cs.brown.edu/humaneva/ (Sigal, Balan, and Black 2010).
Chapter 13: Image-based rendering
The (New) Stanford Light Field Archive, http://lightfield.stanford.edu/ (Wilburn, Joshi, Vaish et al. 2005).
Virtual Viewpoint Video: multi-viewpoint video with per-frame depth maps, http://research.microsoft.com/en-us/um/redmond/groups/ivm/vvv/ (Zitnick, Kang, Uyttendaele et al. 2004).
Chapter 14: Recognition
For a list of visual recognition databases, see Tables 14.1–14.2. In addition to those, here are:
Buffy pose classes, http://www.robots.ox.ac.uk/~vgg/data/buffy_pose_classes/ and Buffy stickmen V2.1, http://www.robots.ox.ac.uk/~vgg/data/stickmen/index.html (Ferrari, Marin-Jimenez, and Zisserman 2009; Eichner and Ferrari 2009).
H3D database of pose/joint annotated photographs of humans, http://www.eecs.berkeley.edu/~lbourdev/h3d/ (Bourdev and Malik 2009).
Action Recognition Datasets, http://www.cs.berkeley.edu/projects/vision/action, has pointers to several datasets for action and activity recognition, as well as some papers. The human action database at http://www.nada.kth.se/cvap/actions/ contains more action sequences.
C.2 Software
One of the best sources for computer vision algorithms is the Open Source Computer Vision library (OpenCV) (http://opencv.willowgarage.com/wiki/), which was developed by Gary Bradski and his colleagues at Intel and is now maintained and extended by Willow Garage (Bradski and Kaehler 2008). A partial list of the functions available at http://opencv.willowgarage.com/documentation/cpp/ includes:
image processing and transforms (filtering, morphology, pyramids);
geometric image transformations (rotation, resizing);
miscellaneous image transformations (Fourier transforms, distance transforms);
histograms;
segmentation (watershed, mean shift);
feature detection (Canny, Harris, Hough, MSER, SURF);
motion analysis and object tracking (Lucas–Kanade, mean shift);
camera calibration and 3D reconstruction;
machine learning (k-nearest neighbors, support vector machines, decision trees, boosting, random trees, expectation-maximization, and neural networks).
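Most of these routines are exposed through OpenCV's C/C++ APIs and their wrappers. As a rough sketch of the kind of routine its machine learning component provides, here is a minimal k-nearest-neighbors classifier in plain Python; the function name and toy data are invented for illustration and are not OpenCV's API:

```python
import math
from collections import Counter

def knn_classify(train, labels, query, k=3):
    """Classify `query` by majority vote among the k closest
    training points under Euclidean distance."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Two well-separated 2D clusters with class labels
train = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = ['a', 'a', 'a', 'b', 'b', 'b']
print(knn_classify(train, labels, (0.5, 0.5)))  # 'a'
print(knn_classify(train, labels, (5.5, 5.2)))  # 'b'
```

OpenCV's real implementations are optimized C++; the point here is only the shape of the computation.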
Intel's Performance Primitives (IPP) library, http://software.intel.com/en-us/intel-ipp/, contains highly optimized code for a wide variety of image processing tasks. Many OpenCV routines take advantage of this library if it is installed, so programs run faster. Its functionality overlaps OpenCV's in many processing operations, and it adds extra libraries for image and video compression, signal and speech processing, and matrix algebra.
The MATLAB Image Processing Toolbox, http://www.mathworks.com/products/image/, contains routines for spatial transformations (rotation, resizing), image registration, image analysis and statistics (edges, Hough transform), image enhancement (adaptive histogram equalization, median filtering), image restoration (deblurring), linear filtering (convolution), image transforms (Fourier, DCT), and morphological operations (connected components and distance transforms).
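As a sketch of what one of the enhancement routines listed above does, here is global histogram equalization for a flat list of 8-bit gray levels, written in plain Python (an illustrative reimplementation, not the toolbox's code): each level is mapped through the normalized cumulative histogram so the output spreads over the full range.

```python
def equalize_histogram(pixels, levels=256):
    """Map each gray level through the normalized cumulative
    histogram, stretching the output over [0, levels-1]."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    n = len(pixels)
    lut = [round(c / n * (levels - 1)) for c in cdf]
    return [lut[p] for p in pixels]

# A dark, low-contrast "image" is stretched to use the full range
print(equalize_histogram([10, 10, 20, 20]))  # [128, 128, 255, 255]
```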
Two older libraries, no longer under active development, still contain a number of useful routines:
VXL (C++ Libraries for Computer Vision Research and Implementation), http://vxl.sourceforge.net/ .
LTI-Lib 2, http://www.ie.itcr.ac.cr/palvarado/ltilib-2/homepage/ .
Image editing and viewing packages such as Windows Live Photo Gallery, iPhoto, Picasa, GIMP, and IrfanView are useful for performing routine processing tasks, converting between formats, and viewing your results. They can also serve as interesting reference implementations of image processing algorithms such as tonal adjustment and denoising.
There are also software packages and frameworks that are useful for building real-time video processing demos. Vision on Tap (http://www.visionontap.com/ ) provides a Web service that can process your webcam feed in real time (Chiu and Raskar 2009). VideoMan (VideoManager, http://videomanlib.sourceforge.net/ ) is useful for building real-time video-based demos and applications. You can also read video directly from any URL (such as a webcam) with imread in MATLAB.
Below, I list some additional Web resources, again arranged by the chapter they relate to most closely:
Chapter 3: Image processing
matlabPyrTools, MATLAB source code for Laplacian pyramids, QMF/wavelets, and steerable pyramids, http://www.cns.nyu.edu/~lcv/software.php (Simoncelli and Adelson 1990a; Simoncelli, Freeman, Adelson et al. 1992).
BLS-GSM image denoising, http://decsai.ugr.es/~javier/denoise/ (Portilla, Strela, Wainwright et al. 2003).
Fast bilateral filtering code, http://people.csail.mit.edu/jiawen/#code (Chen, Paris, and Durand 2007).
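The bilateral filter itself is simple to state (the contribution of the linked code is making it fast); a brute-force 1D version in plain Python, for illustration only:

```python
import math

def bilateral_1d(signal, sigma_s=2.0, sigma_r=0.5, radius=4):
    """Edge-preserving smoothing: each sample becomes a weighted
    mean of its neighbours, with weights that fall off with both
    spatial distance and intensity difference."""
    out = []
    for i, v in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius),
                       min(len(signal), i + radius + 1)):
            w = math.exp(-(i - j) ** 2 / (2 * sigma_s ** 2)
                         - (v - signal[j]) ** 2 / (2 * sigma_r ** 2))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

step = [0.0] * 4 + [1.0] * 4
smoothed = bilateral_1d(step)
# The step edge survives: values stay near 0 on the left, near 1 on the right
```

A Gaussian filter would blur the step; the intensity term suppresses the contribution of samples across the edge, which is what makes the filter edge-preserving.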
C++ implementation of the fast distance transform algorithm, http://people.cs.uchicago.edu/~pff/dt/ (Felzenszwalb and Huttenlocher 2004a).
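The 1D building block of that algorithm, the lower envelope of parabolas, is short enough to sketch; the following is a simplified Python reimplementation for illustration, with a large constant standing in for infinity:

```python
def dt1d(f):
    """1D squared-Euclidean distance transform (Felzenszwalb &
    Huttenlocher): d[p] = min_q ((p - q)**2 + f[q]), computed in
    O(n) by maintaining the lower envelope of parabolas."""
    n = len(f)
    v = [0] * n            # indices of parabolas in the envelope
    z = [0.0] * (n + 1)    # boundaries between adjacent parabolas
    k = 0
    z[0], z[1] = -float('inf'), float('inf')
    for q in range(1, n):
        while True:
            # intersection of parabola from q with the rightmost one
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) \
                / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1     # parabola v[k] is fully hidden; discard it
            else:
                break
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, float('inf')
    d, k = [], 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d.append((p - v[k]) ** 2 + f[v[k]])
    return d

BIG = 10**9   # stands in for "infinity" at non-seed samples
# Seeds at positions 0 and 4: squared distance to the nearest seed
print(dt1d([0, BIG, BIG, BIG, 0, BIG]))  # [0, 1, 4, 1, 0, 1]
```

The 2D transform is obtained by running this pass over every row and then over every column of the result.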
GREYC's Magic Image Converter, including image restoration software using regularization and anisotropic diffusion, http://gmic.sourceforge.net/gimp.shtml (Tschumperlé and Deriche 2005).
Chapter 4: Feature detection and matching
VLFeat, an open and portable library of computer vision algorithms, http://vlfeat.org/ (Vedaldi and Fulkerson 2008).
SiftGPU: a GPU implementation of the Scale Invariant Feature Transform (SIFT), http://www.cs.unc.edu/~ccwu/siftgpu/ (Wu 2010).
SURF: Speeded Up Robust Features, http://www.vision.ee.ethz.ch/~surf/ (Bay, Tuytelaars, and Van Gool 2006).
FAST corner detection, http://mi.eng.cam.ac.uk/~er258/work/fast.html (Rosten and Drummond 2005, 2006).
Linux binaries for affine region detectors and descriptors, as well as MATLAB files to compute repeatability and matching scores, http://www.robots.ox.ac.uk/~vgg/research/affine/ .
Kanade–Lucas–Tomasi feature trackers: KLT, http://www.ces.clemson.edu/~stb/klt/ (Shi and Tomasi 1994); GPU-KLT, http://cs.unc.edu/~cmzach/opensource.html (Zach, Gallup, and Frahm 2008); Lucas–Kanade 20 Years On, http://www.ri.cmu.edu/projects/project_515.html (Baker and Matthews 2004).
Chapter 5: Segmentation
Efficient graph-based segmentation, http://people.cs.uchicago.edu/~pff/segment (Felzenszwalb and Huttenlocher 2004b).
EDISON, edge detection and image segmentation, http://coewww.rutgers.edu/riul/research/code/EDISON/ (Meer and Georgescu 2001; Comaniciu and Meer 2002).
Normalized cuts segmentation including intervening contours, http://www.cis.upenn.edu/~jshi/software/ (Shi and Malik 2000; Malik, Belongie, Leung et al. 2001).
Segmentation by weighted aggregation (SWA), http://www.cs.weizmann.ac.il/~vision/SWA (Alpert, Galun, Basri et al. 2007).
Chapter 6: Feature-based alignment and calibration
Non-iterative PnP algorithm, http://cvlab.epfl.ch/software/EPnP (Moreno-Noguer, Lepetit, and Fua 2007).
Tsai Camera Calibration Software, http://www-2.cs.cmu.edu/~rgw/TsaiCode.html (Tsai 1987).
Easy Camera Calibration Toolkit, http://research.microsoft.com/en-us/um/people/zhang/Calib/ (Zhang 2000).
Camera Calibration Toolbox for MATLAB, http://www.vision.caltech.edu/bouguetj/calib_doc/ ; a C version is included in OpenCV.
MATLAB functions for multiple view geometry, http://www.robots.ox.ac.uk/~vgg/hzbook/code/ (Hartley and Zisserman 2004).
Chapter 7: Structure from motion
SBA: a generic sparse bundle adjustment C/C++ package based on the Levenberg–Marquardt algorithm, http://www.ics.forth.gr/~lourakis/sba/ (Lourakis and Argyros 2009).
Simple sparse bundle adjustment (SSBA), http://cs.unc.edu/~cmzach/opensource.html .
Bundler, structure from motion for unordered image collections, http://phototour.cs.washington.edu/bundler/ (Snavely, Seitz, and Szeliski 2006).
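Packages such as SBA are built around Levenberg–Marquardt, which damps the Gauss–Newton step and adapts the damping according to whether the error actually decreased. The core loop can be sketched on a toy one-parameter problem, fitting the rate b in y = exp(b*t); everything below is illustrative and unrelated to SBA's actual API:

```python
import math

def lm_fit_rate(ts, ys, b=0.0, iters=100, lam=1e-3):
    """Toy Levenberg-Marquardt: minimize sum_i (exp(b*t_i) - y_i)**2
    over the single parameter b, with adaptive damping lam."""
    def sse(b):
        return sum((math.exp(b * t) - y) ** 2 for t, y in zip(ts, ys))
    for _ in range(iters):
        # gradient g = J^T r and scalar Gauss-Newton Hessian H = J^T J,
        # where r_i = exp(b*t_i) - y_i and J_i = t_i * exp(b*t_i)
        g = sum(t * math.exp(b * t) * (math.exp(b * t) - y)
                for t, y in zip(ts, ys))
        H = sum((t * math.exp(b * t)) ** 2 for t in ts)
        step = -g / (H * (1 + lam))   # damped normal equations
        if sse(b + step) < sse(b):
            b += step                  # improvement: accept, trust more
            lam *= 0.5
        else:
            lam *= 2.0                 # got worse: reject, damp harder
    return b

ts = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * t) for t in ts]  # synthetic data with b = 0.5
b = lm_fit_rate(ts, ys)
```

In bundle adjustment, b becomes thousands of camera and point parameters, and the sparsity of J is what SBA exploits to keep the normal equations tractable.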
Chapter 8: Dense motion estimation
Optical flow code, http://www.cs.brown.edu/~black/code.html (Black and Anandan 1996).
Optical flow using total variation and conjugate gradient descent, http://people.csail.mit.edu/celiu/OpticalFlow/ (Liu 2009).
TV-L1 optical flow on the GPU, http://cs.unc.edu/~cmzach/opensource.html (Zach, Pock, and Bischof 2007a).
elastix: a toolbox for rigid and nonrigid registration of images, http://elastix.isi.uu.nl/ (Klein, Staring, and Pluim 2007).
Deformable image registration using discrete optimization, http://www.mrf-registration.net/deformable/index.html (Glocker, Komodakis, Tziritas et al. 2008).
Chapter 9: Image stitching
Microsoft Research Image Compositing Editor for stitching images, http://research.microsoft.com/en-us/um/redmond/groups/ivm/ice/ .
Chapter 10: Computational photography
HDRShop software for combining bracketed exposures into high dynamic range radiance images, http://projects.ict.usc.edu/graphics/HDRShop/ .
Super-resolution code, http://www.robots.ox.ac.uk/~vgg/software/SR/ (Pickup 2007; Pickup, Capel, Roberts et al. 2007, 2009).
Chapter 11: Stereo correspondence
StereoMatcher, standalone C++ stereo matching code, http://vision.middlebury.edu/stereo/code/ (Scharstein and Szeliski 2002).
Patch-based multi-view stereo software (PMVS Version 2), http://grail.cs.washington.edu/software/pmvs/ (Furukawa and Ponce 2011).
Chapter 12: 3D reconstruction
Scanalyze: a system for aligning and merging range data, http://graphics.stanford.edu/software/scanalyze/ (Curless and Levoy 1996).
MeshLab: software for processing, editing, and visualizing unstructured 3D triangular meshes, http://meshlab.sourceforge.net/ .
VRML viewers (various) are also a good way to visualize texture-mapped 3D models.
Section 12.6.4: Whole body modeling and tracking
Bayesian 3D person tracking, http://www.cs.brown.edu/~black/code.html (Sidenbladh, Black, and Fleet 2000; Sidenbladh and Black 2003).
HumanEva: baseline code for the tracking of articulated human motion, http://vision.cs.brown.edu/humaneva/ (Sigal, Balan, and Black 2010).
Section 14.1.1: Face detection
Sample face detection code and evaluation tools, http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html .
Section 14.1.2: Pedestrian detection
A simple object detector with boosting, http://people.csail.mit.edu/torralba/shortCourseRLOC/boosting/boosting.html (Hastie, Tibshirani, and Friedman 2001; Torralba, Murphy, and Freeman 2007).
Discriminatively trained deformable part models, http://people.cs.uchicago.edu/~pff/latent/ (Felzenszwalb, Girshick, McAllester et al. 2010).
Upper-body detector, http://www.robots.ox.ac.uk/~vgg/software/UpperBody/ (Ferrari, Marin-Jimenez, and Zisserman 2008).
2D articulated human pose estimation software, http://www.vision.ee.ethz.ch/~calvin/articulated_human_pose_estimation_code/ (Eichner and Ferrari 2009).
Section 14.2.2: Active appearance and 3D shape models
AAMtools: an active appearance modeling toolbox, http://cvsp.cs.ntua.gr/software/AAMtools/ (Papandreou and Maragos 2008).
Section 14.3: Instance recognition
FASTANN and FASTCLUSTER for approximate k-means (AKM), http://www.robots.ox.ac.uk/~vgg/software/ (Philbin, Chum, Isard et al. 2007).
Feature matching using fast approximate nearest neighbors, http://people.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN (Muja and Lowe 2009).
Section 14.4.1: Bag of words