Computer Vision Datasets
Datasets
who is the best at X?
Computer Vision Datasets
- website: http://clickdamage.com/sourcecode/index.html
- code: http://clickdamage.com/sourcecode/cv_datasets.php
- mirror: http://pan.baidu.com/s/1pJmqD4n
Open Images dataset
- blog: https://research.googleblog.com/2016/09/introducing-open-images-dataset.html
- github: https://github.com/openimages/dataset
- Academic Torrents: http://academictorrents.com/details/9e9194e21ce045deee8d811481b4cd676b20b06b
Image & Vision Group - Datasets
- intro: Image & Vision, Clothing & Fashion, Computer Graphics, Video Sequences
- homepage: http://caiivg.weebly.com/dataset.html
- intro: Google I/O Dataset, Names 100 Dataset, Clothing Attributes Dataset, Stanford Mobile Visual Search Dataset, CNN 2-Hours Videos Dataset
- homepage: http://huizhongchen.github.io/datasets.html#clothingattributedataset
Classification / Recognition
A Large-Scale Car Dataset for Fine-Grained Categorization and Verification
- project page: http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html
- arxiv: http://arxiv.org/abs/1506.08959
CIFAR-10
- intro: The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
- homepage: http://www.cs.toronto.edu/~kriz/cifar.html
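The layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the "python version" archive from the homepage, in which each batch file is a pickled dictionary whose `data` entry holds one 3,072-value row per image (1,024 red values, then 1,024 green, then 1,024 blue) and whose `labels` entry holds the class indices. The helper names `load_batch` and `row_to_pixels` are illustrative and not part of any official tooling.

```python
import pickle
from pathlib import Path


def load_batch(path):
    """Load one CIFAR-10 batch file and return (rows, labels).

    Assumption: the file comes from the "python version" archive.
    The real files store `data` as a NumPy array, so NumPy must be
    installed for pickle to rebuild it.
    """
    with Path(path).open("rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


def row_to_pixels(row, size=32):
    """Convert one flat row (R plane, G plane, B plane) into a
    size x size grid of (r, g, b) tuples."""
    plane = size * size
    red = row[:plane]
    green = row[plane:2 * plane]
    blue = row[2 * plane:3 * plane]
    return [
        [(red[y * size + x], green[y * size + x], blue[y * size + x])
         for x in range(size)]
        for y in range(size)
    ]
```

With the archive unpacked, `rows, labels = load_batch("cifar-10-batches-py/data_batch_1")` followed by `row_to_pixels(rows[0])` would give the first training image as a 32x32 grid; the directory name is the one the archive is expected to unpack to and may differ locally.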
Face
The MegaFace Benchmark: 1 Million Faces for Recognition at Scale
- homepage: http://megaface.cs.washington.edu/
- arxiv: http://arxiv.org/abs/1512.00596
MSR Image Recognition Challenge (IRC)
UMDFaces: An Annotated Face Dataset for Training Deep Networks
Vehicle
The Comprehensive Cars (CompCars) dataset
http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/
BoxCars: Improving Fine-Grained Recognition of Vehicles Using 3-D Bounding Boxes in Traffic Surveillance [IEEE T-ITS]
https://medusa.fit.vutbr.cz/traffic/research-topics/fine-grained-vehicle-recognition/boxcars-improving-vehicle-fine-grained-recognition-using-3d-bounding-boxes-in-traffic-surveillance/
Cars Dataset
- intro: contains 16,185 images of 196 classes of cars.
- homepage: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
Scene Recognition
Places: An Image Database for Deep Scene Understanding
- project page: http://places.csail.mit.edu/index.html
- arxiv: https://arxiv.org/abs/1610.02055
Places2
- intro: Places2 contains more than 10 million images comprising 400+ unique scene categories
- homepage: http://places2.csail.mit.edu/
MNIST
EMNIST: an extension of MNIST to handwritten letters
Fashion-MNIST
- arxiv: https://arxiv.org/abs/1708.07747
- github: https://github.com/zalandoresearch/fashion-mnist
- benchmark: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/
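Fashion-MNIST is distributed in the same gzip-compressed IDX files as MNIST, so a reader needs only the standard library. The sketch below assumes the usual IDX header layout (big-endian 32-bit fields, magic number 2051 for image files and 2049 for label files); the function names are illustrative.

```python
import gzip
import struct


def read_gz(path):
    """Return the decompressed bytes of a .gz file."""
    with gzip.open(path, "rb") as f:
        return f.read()


def parse_idx_images(raw):
    """Parse an IDX3 image buffer into (images, rows, cols).

    Each image is returned as a bytes object of rows * cols pixels.
    """
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols
    images = [raw[16 + i * size:16 + (i + 1) * size] for i in range(count)]
    return images, rows, cols


def parse_idx_labels(raw):
    """Parse an IDX1 label buffer into a list of integer labels."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:
        raise ValueError(f"unexpected label magic number: {magic}")
    return list(raw[8:8 + count])
```

For example, `parse_idx_images(read_gz("train-images-idx3-ubyte.gz"))` would parse the training images, assuming the file has been downloaded from the GitHub repository under that name.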
Food
3 Million Instacart Orders, Open Sourced
https://tech.instacart.com/3-million-instacart-orders-open-sourced-d40d29ead6f2
Detection
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
- intro: YouTube-BoundingBoxes (YT-BB)
- homepage: https://research.google.com/youtubebb/
- arxiv: https://arxiv.org/abs/1702.00824
https://arxiv.org/abs/1804.00525
Exclusively Dark (ExDark) Image Dataset
- intro: The Exclusively Dark (ExDark) dataset is, to the best of its authors' knowledge, the largest collection to date of low-light images, taken in 10 different conditions ranging from very low-light environments to twilight, with image-class and object-level annotations.
- github: https://github.com/cs-chan/Exclusively-Dark-Image-Dataset
Face Detection
FDDB: Face Detection Data Set and Benchmark
- homepage: http://vis-www.cs.umass.edu/fddb/index.html
- results: http://vis-www.cs.umass.edu/fddb/results.html
Pedestrian Detection
Caltech Pedestrian Detection Benchmark
Caltech Pedestrian Dataset Converter
https://github.com/mitmul/caltech-pedestrian-dataset-converter
CityPersons: A Diverse Dataset for Pedestrian Detection
- arxiv: https://arxiv.org/abs/1702.05693
- bitbucket: https://bitbucket.org/shanshanzhang/citypersons
- supplemental: http://openaccess.thecvf.com/content_cvpr_2017/supplemental/Zhang_CityPersons_A_Diverse_2017_CVPR_supplemental.pdf
CrowdHuman
- intro: CrowdHuman contains 15,000, 4,370, and 5,000 images for training, validation, and testing, respectively. The training and validation subsets contain a total of 470K human instances, about 23 persons per image, with various kinds of occlusions.
- homepage: https://sshao0516.github.io/CrowdHuman/
- intro: collected on-board a moving vehicle in 31 cities of 12 European countries, over 238200 person instances manually labeled in over 47300 images, contains a large number of person orientation annotations (over 211200)
- arxiv: https://arxiv.org/abs/1805.07193
Vehicle Detection
Toyota Motor Europe (TME) Motorway Dataset
- intro: composed of 28 clips for a total of approximately 27 minutes (30,000+ frames) with vehicle annotation
- homepage: http://cmp.felk.cvut.cz/data/motorway/
- intro: 9,850 vehicle images at 1600×1200 and 1920×1080, captured by two cameras at different times and places
- homepage: http://iitlab.bit.edu.cn/mcislab/vehicledb/
Saliency Detection
MSRA10K Salient Object Database
http://mmcheng.net/msra10k/
Logo Detection
QMUL-OpenLogo: Open Logo Detection Challenge
- intro: QMUL-OpenLogo contains 27,083 images from 352 logo classes, built by aggregating and refining 7 existing datasets and establishing an open logo detection evaluation protocol
- homepage: https://qmul-openlogo.github.io/
Head Detection
SCUT-HEAD
- intro: SCUT-HEAD is a large-scale head detection dataset, including 4,405 images labeled with 111,251 heads.
- github: https://github.com/HCIILAB/SCUT-HEAD-Dataset-Release
http://www.di.ens.fr/willow/research/headdetection/
Brainwash dataset.
https://exhibits.stanford.edu/data/catalog/sx925dc9385
Detection From Video
YouTube-Objects dataset v2.2
ILSVRC2015: Object detection from video (VID)
Segmentation
Mapillary Vistas Dataset
Mapillary Vistas Dataset
- intro: 25,000 high-resolution images, 100 object categories, 60 of those instance-specific
https://www.mapillary.com/dataset/
http://blog.mapillary.com/product/2017/05/03/mapillary-vistas-dataset.html
Multi-Human Parsing
https://lv-mhp.github.io/
PASCAL VOC
Augmented Pascal VOC
http://home.bharathh.info/pubs/codes/SBD/download.html
Supervisely Person
- homepage: https://supervise.ly/
- blog: https://hackernoon.com/releasing-supervisely-person-dataset-for-teaching-machines-to-segment-humans-1f1fc1f28469
Microsoft COCO
- homepage: http://mscoco.org/
- github: https://github.com/pdollar/coco
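COCO annotation files are JSON, and each entry in the `annotations` list stores its box as `[x, y, width, height]` measured from the top-left corner of the image. Many detection pipelines expect corner coordinates instead, so a conversion step is common. The following is a minimal sketch using only the standard library; `coco_boxes_by_image` is an illustrative name and not a function of the official COCO API.

```python
import json


def coco_boxes_by_image(annotation_json):
    """Group COCO boxes by image id, converted to corner format.

    Returns {image_id: [(category_id, x_min, y_min, x_max, y_max), ...]}.
    """
    data = json.loads(annotation_json)
    boxes = {}
    for ann in data["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO stores top-left corner plus size
        entry = (ann["category_id"], x, y, x + w, y + h)
        boxes.setdefault(ann["image_id"], []).append(entry)
    return boxes
```

The function takes the file contents as a string, so a real annotation file would be read first, for example with `Path(...).read_text()`.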
The Oxford-IIIT Pet Dataset
- intro: a 37 category pet dataset with roughly 200 images for each class. All images have an associated ground truth annotation of breed, head ROI, and pixel level trimap segmentation
- homepage: http://www.robots.ox.ac.uk/~vgg/data/pets/
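The pixel-level trimaps mentioned above label each pixel as 1 (foreground), 2 (background), or 3 (not classified, typically the pet's outline). A binary mask is often more convenient for training. The sketch below assumes the trimap has already been decoded into nested lists of integers (decoding the PNG itself would need an imaging library); `trimap_to_mask` is an illustrative helper.

```python
def trimap_to_mask(trimap, include_border=False):
    """Convert a trimap (1 = pet, 2 = background, 3 = unclassified)
    into a binary mask with 1 for pet pixels and 0 elsewhere.

    When include_border is True, unclassified pixels count as pet.
    """
    keep = {1, 3} if include_border else {1}
    return [[1 if value in keep else 0 for value in row] for row in trimap]
```

Whether to count the unclassified band as foreground is a design choice; excluding it gives a conservative mask, while including it avoids a gap around the silhouette.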
COCO-Stuff
COCO-Stuff: Thing and Stuff Classes in Context
COCO-Stuff 10K dataset v1.1
https://arxiv.org/abs/1612.03716
https://github.com/nightrome/cocostuff
Scene Parsing
MIT Scene Parsing Benchmark
http://sceneparsing.csail.mit.edu/
ADE20K
- intro: train: 20,210 images, val: 2,000 images. Contains 150 stuff/object category labels (e.g., wall, sky, and tree) and 1,038 image-level scene descriptors (e.g., airport terminal, bedroom, and street).
- homepage: http://groups.csail.mit.edu/vision/datasets/ADE20K/
https://arxiv.org/abs/1608.05442
ImageNet
ImageNet-Utils
- intro: Utils to help download images by id, crop bounding box, label images, etc.
- github: https://github.com/tzutalin/ImageNet_Utils
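To illustrate the bounding-box step, here is a minimal sketch of reading boxes from an annotation file. It assumes the annotations follow the PASCAL VOC XML layout (an `object` element with a `name` and a `bndbox` holding `xmin`, `ymin`, `xmax`, `ymax`), which is how ImageNet localization annotations are commonly distributed. It does not reproduce the ImageNet-Utils code; `read_boxes` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, trimmed to the fields the parser uses.
SAMPLE = """
<annotation>
  <filename>example</filename>
  <object>
    <name>n01440764</name>
    <bndbox>
      <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return [(label, xmin, ymin, xmax, ymax), ...] from VOC-style XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.findtext(key))
                  for key in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes
```

Each returned tuple can be passed to an image library's crop routine; most take the same left, upper, right, lower ordering, but that should be checked against the library in use.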
Captioning / Description
TGIF: A New Dataset and Benchmark on Animated GIF Description
Collecting Multilingual Parallel Video Descriptions Using Mechanical Turk
- intro: 1970 YouTube video snippets: 1200 training, 100 validation, 670 test
- homepage: http://www.cs.utexas.edu/users/ml/clamp/videoDescription/
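The 1200/100/670 partition above can be expressed as a small helper. This is a sketch under the assumption that the split is taken contiguously from an ordered list of clip ids; the source only states the counts, so the ordering and the name `split_clips` are illustrative.

```python
def split_clips(clip_ids, n_train=1200, n_val=100, n_test=670):
    """Split an ordered list of clip ids into train/val/test parts.

    Assumption: the split is contiguous in list order.
    """
    expected = n_train + n_val + n_test
    if len(clip_ids) != expected:
        raise ValueError(f"expected {expected} clips, got {len(clip_ids)}")
    train = clip_ids[:n_train]
    val = clip_ids[n_train:n_train + n_val]
    test = clip_ids[n_train + n_val:]
    return train, val, test
```

The length check guards against silently producing a short test set when some snippets are no longer available for download.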
Video
Dataset | # Videos | # Classes | Year | Manually Labeled? |
---|---|---|---|---|
Kodak | 1,358 | 25 | 2007 | ✓ |
HMDB51 | 7,000 | 51 | | |
Charades | 9,848 | 157 | | |
MCG-WEBV | 234,414 | 15 | 2009 | ✓ |
CCV | 9,317 | 20 | 2011 | ✓ |
UCF-101 | 13,320 | 101 | 2012 | ✓ |
THUMOS-2 | 18,394 | 101 | 2014 | ✓ |
MED-2014 | ≈28,000 | 20 | 2014 | ✓ |
Sports-1M | 1M | 487 | 2014 | ✗ |
ActivityNet | 27,801 | 203 | 2015 | ✓ |
FCVID | 91,223 | 239 | 2015 | ✓ |
UCF101 - Action Recognition Data Set
- homepage: http://crcv.ucf.edu/data/UCF101.php
ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding
- homepage: http://activity-net.org/
- download: http://activity-net.org/download.html
- github: https://github.com/activitynet
- homepage: https://github.com/gtoderici/sports-1m-dataset/blob/wiki/ProjectHome.md
- github: https://github.com/gtoderici/sports-1m-dataset/
- thumbnails: http://cs.stanford.edu/people/karpathy/deepvideo/classes.html
- intro: This dataset guides our research into unstructured video activity recognition and commonsense reasoning for daily human activities.
- intro: The dataset contains 66,500 temporal annotations for 157 action classes, 41,104 labels for 46 object classes, and 27,847 textual descriptions of the videos.
- homepage: http://allenai.org/plato/charades/
- homepage: http://bigvid.fudan.edu.cn/FCVID/
- homepage: http://research.google.com/youtube8m/
- arxiv: http://arxiv.org/abs/1609.08675
- intro: 9 TB, 35,000,000 clips, 32 frames
- intro: Generating Videos with Scene Dynamics
- homepage: http://web.mit.edu/vondrick/tinyvideo/#data
- intro: Google
- homepage: https://deepmind.com/research/open-source/open-source-datasets/kinetics/
- arxiv: https://arxiv.org/abs/1705.06950
- intro: “Currently, e-VDS35 has 35 classes and a total of 2050 videos of roughly 10 seconds each (see histogram below). We are aiming to collect overall 1750 (50 × 35) videos with your help.”
- homepage: https://engineering.purdue.edu/elab/eVDS
- intro: Sortable and searchable compilation of video datasets
- homepage: https://www.di.ens.fr/~miech/datasetviz/
Scene
SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth
- intro: Imperial College London
- project page: https://robotvault.bitbucket.org/scenenet-rgbd.html
- arxiv: https://arxiv.org/abs/1612.05079
- github: https://github.com/jmccormac/pySceneNetRGBD
Autonomous Driving
BDD: Berkeley Deep Drive
- intro: 100,000 HD video sequences covering over 1,100 hours of driving across many different times of day, weather conditions, and driving scenarios
- homepage: http://bdd-data.berkeley.edu/
- github: https://github.com/ucbdrive/bdd-data
OCR
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
- homepage: http://vision.cornell.edu/se3/coco-text/
- arxiv: http://arxiv.org/abs/1601.07140
- intro: 32,285 high resolution images, 1,018,402 character instances, 3,850 character categories, 6 kinds of attributes
- homepage: https://ctwdataset.github.io/
- arxiv: https://arxiv.org/abs/1803.00085
Retrieval
Oxford5k
Paris6k
Oxford105k
UKB
NUS-WIDE
ImageNet-YahooQA
DeepFashion: In-shop Clothes Retrieval
- intro: 7,982 clothing items, 52,712 in-shop clothes images, and ~200,000 cross-pose/scale pairs. Each image is annotated with bounding box, clothing type, and pose type.
- homepage: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/InShopRetrieval.html
Person Re-ID
Dataset | Description |
---|---|
CUHK01 | 971 identities, 3884 images, manually cropped |
CUHK02 | 1816 identities, 7264 images, manually cropped |
CUHK03 | 1360 identities, 13164 images, manually cropped + automatically detected |
Person Re-identification Datasets
- homepage: http://robustsystems.coe.neu.edu/sites/robustsystems.coe.neu.edu/files/systems/projectpages/reiddataset.html
- github: https://github.com/RSL-NEU/person-reid-benchmark
http://www.ee.cuhk.edu.hk/~xgwang/CUHK_identification.html
PRW (Person Re-identification in the Wild) Dataset
- homepage: http://www.liangzheng.com.cn/Project/project_prw.html
- github: https://github.com/liangzheng06/PRW-baseline
- intro: CVPR 2017 spotlight
- arxiv: https://arxiv.org/abs/1604.02531
- intro: DukeMTMC-reID is a subset of the DukeMTMC for image-based re-identification, in the format of the Market-1501 dataset
- intro: 16,522 training images of 702 identities, 2,228 query images of the other 702 identities and 17,661 gallery images
- github: https://github.com/layumi/DukeMTMC-reID_evaluation
- intro: DukeMTMC4ReID dataset
- github: https://github.com/NEU-Gou/DukeReID
https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/PRID11/
MARS (Motion Analysis and Re-identification Set) Dataset
- intro: an extension of the Market-1501 dataset
- homepage: http://www.liangzheng.com.cn/Project/project_mars.html
- github: https://github.com/liangzheng06/MARS-evaluation
- intro: This repository provides the X-MARS dataset splits for image to video/tracklet evaluation
- github: https://github.com/andreas-eberle/x-mars
- intro: 15-camera (12 outdoor cameras, 3 indoor cameras), 4,101 Identities, 126,441 BBoxes
- homepage: http://www.pkuvmc.com/publications/longhui.html
- soa: http://www.pkuvmc.com/publications/state_of_the_art.html
- intro: train/test identities: 1,975/756
- homepage: http://liuyu.us/dataset/lpw/
https://drive.google.com/file/d/0B56OfSrVI8hubVJLTzkwV2VaOWM/view
3DPeS
http://www.openvisor.org/3dpes.asp
Fashion
Large-scale Fashion (DeepFashion) Database
- intro: Attribute Prediction, Consumer-to-shop Clothes Retrieval, In-shop Clothes Retrieval, and Landmark Detection
- homepage: http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion.html
- intro: 15 clothing classes, 88951 images
- homepage: http://people.ee.ethz.ch/~lbossard/projects/accv12/index.html
Attribute Datasets
Attribute Datasets
- intro: in total 41,585 pedestrian samples, each of which is annotated with 72 attributes as well as viewpoints, occlusions, body parts information
- homepage: https://www.ecse.rpi.edu/homepages/cvrl/database/AttributeDataset.htm
Pedestrian Attribute Recognition
A Richly Annotated Dataset for Pedestrian Attribute Recognition
- homepage: http://rap.idealtest.org/
- arxiv: https://arxiv.org/abs/1603.07054
- intro: PEdesTrian Attribute (PETA)
- homepage: http://mmlab.ie.cuhk.edu.hk/projects/PETA.html
- paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/mm14.pdf
DukeMTMC-attribute
Tracking
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
- homepage: http://detrac-db.rit.albany.edu/
- arxiv: https://arxiv.org/abs/1511.04136
- intro: DukeMTMC aims to accelerate advances in multi-target multi-camera tracking. It provides a tracking system that works within and across cameras, a new large scale HD video data set recorded by 8 synchronized cameras with more than 7,000 single camera trajectories and over 2,000 unique identities
- homepage: http://vision.cs.duke.edu/DukeMTMC/
https://cvlab.epfl.ch/data/wildtrack
Color Classification
Vehicle Color Recognition on an Urban Road by Feature Context
http://mclab.eic.hust.edu.cn/~pchen/project.html
License Plate Detection and Recognition
Application-Oriented License Plate (AOLP) Database
http://aolpr.ntust.edu.tw/lab/download.html
CCPD: Chinese City Parking Dataset
- paper: http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf
- github: https://github.com/detectRecog/CCPD
- dataset: https://drive.google.com/file/d/1fFqCXjhk7vE9yLklpJurEwP9vdLZmrJd/view
Tools
VoTT: Visual Object Tagging Tool 1.5
- intro: Visual Object Tagging Tool: an Electron app for building end-to-end object detection models from images and videos
- github: https://github.com/Microsoft/VoTT
Pychet Labeller
- intro: A python based annotation/labelling toolbox for images. The program allows the user to annotate individual objects in images.
- github: https://github.com/sbargoti/pychetlabeller
- intro: Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
- github: https://github.com/vicolab/ml-pyxis
BBox-Label-Tool
- intro: A simple tool for labeling object bounding boxes in images
- github: https://github.com/puzzledqs/BBox-Label-Tool
- intro: A GUI tool for conveniently labeling objects in video, using object tracking.
- github: https://github.com/hahnyuan/video_labeler
- intro: Computer Vision Annotation Tool (CVAT) is a web-based tool which helps to annotate video and images for Computer Vision algorithms
- github: https://github.com/opencv/cvat
Artist
BAM! The Behance Artistic Media Dataset
- intro: 2.5M artwork URLs, 393K attribute labels, 74K short image descriptions/captions
- project page: https://bam-dataset.org/
- arxiv: https://arxiv.org/abs/1704.08614
Resources
CV Datasets on the web
http://www.cvpapers.com/datasets.html
Awesome Public Datasets
- intro: An awesome list of high-quality open datasets in public domains (on-going). By everyone, for everyone!
- github: https://github.com/caesar0301/awesome-public-datasets
https://archive.ics.uci.edu/ml/datasets.html