Data sets

This site lists datasets and repositories associated with Multimedia and Visual Information Systems.

The IAPR TC-12 Benchmark

The image collection of the IAPR TC-12 Benchmark consists of 20,000 still natural images taken at locations around the world. It covers an assorted cross-section of subjects, including pictures of different sports and actions, photographs of people, animals, cities, landscapes, and many other aspects of contemporary life.

TRECVid benchmark

The TREC conference series is sponsored by the National Institute of Standards and Technology (NIST) with additional support from other U.S. government agencies. The goal of the conference series is to encourage research in information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. In 2001 and 2002 the TREC series sponsored a video “track” devoted to research in automatic segmentation, indexing, and content-based retrieval of digital video. Beginning in 2003, this track became an independent evaluation (TRECVID) with a 2-day workshop taking place just before TREC.


CoPhIR

The CoPhIR (Content-based Photo Image Retrieval) Test-Collection was developed to run meaningful tests of the scalability of the SAPIR project infrastructure (SAPIR: Search In Audio Visual Content Using Peer-to-peer IR) for similarity search.

CoPhIR is now available to the research community for trying out and comparing different indexing technologies for similarity search, with scalability being the key issue.

The Amsterdam Library of Images

ALOI is a color image collection of one thousand small objects, recorded for scientific purposes. In order to capture the sensory variation in object recordings, we systematically varied viewing angle, illumination angle, and illumination color for each object, and additionally captured wide-baseline stereo images. We recorded over a hundred images of each object, yielding a total of 110,250 images for the collection.

Collections from Web 2.0 APIs

With the rise of user-generated content in collaborative tagging environments, it is relatively straightforward to gather large media collections via site-provided APIs.
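
Gathering such a collection usually means requesting results page by page until enough items have been retrieved. The following is a minimal Python sketch of that loop, not the client of any particular site: `collect_items`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and the fake fetcher stands in for a real HTTP call to a site's API.

```python
from typing import Callable, Dict, List


def collect_items(fetch_page: Callable[[int], List[Dict]], max_items: int) -> List[Dict]:
    """Collect up to max_items records by requesting pages 1, 2, 3, ...

    fetch_page is a caller-supplied function (hypothetical here) that returns
    the records on a given page, or an empty list when no pages are left.
    """
    items: List[Dict] = []
    page = 1
    while len(items) < max_items:
        batch = fetch_page(page)
        if not batch:  # the API has no more results
            break
        items.extend(batch)
        page += 1
    return items[:max_items]


def fake_fetch_page(page: int) -> List[Dict]:
    """Stand-in for a real API call: three pages with two tagged photos each."""
    pages = {
        1: [{"id": 1, "tags": ["beach"]}, {"id": 2, "tags": ["city"]}],
        2: [{"id": 3, "tags": ["dog"]}, {"id": 4, "tags": ["sport"]}],
        3: [{"id": 5, "tags": ["tree"]}, {"id": 6, "tags": ["car"]}],
    }
    return pages.get(page, [])


photos = collect_items(fake_fetch_page, max_items=5)
```

In practice `fetch_page` would wrap the site's documented search endpoint and respect its rate limits and terms of use.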

The MediaMill challenge

The MediaMill challenge builds upon the TRECVID collection described above. It provides annotations and intermediate results for 101 semantic concepts, offering an easy entry point for semantic video indexing on a large collection.

Benchmarking Suite Retrieval Analyzer of the University of Arizona

The benchmarking suite Retrieval Analyzer includes a collection of three mapping methods described in this paper. The inputs to Retrieval Analyzer are a vector of computer scores (your image retrieval scores for the image pairs we have provided in our ground truth data) and corresponding human scores (our ground truth data). The outputs consist of a correlation score and an estimated precision-recall curve. We provide an option for choosing any of the three mapping methods, but we recommend using the method that maximizes the correlation score.
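
To make the inputs and outputs concrete, here is a minimal Python sketch that takes a vector of computer scores and a vector of human scores and computes a Pearson correlation and a set of precision-recall points. It is not the Retrieval Analyzer and does not implement its three mapping methods. The function names, the example scores, and the rule that a pair is relevant when its human score reaches `relevance_threshold` are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def precision_recall(computer_scores, human_scores, relevance_threshold):
    """Return (precision, recall) points, one per rank cutoff.

    Pairs are ranked by computer score, highest first. A pair counts as
    relevant when its human score is >= relevance_threshold (an assumption
    of this sketch, not a rule taken from the Retrieval Analyzer).
    """
    relevant = [h >= relevance_threshold for h in human_scores]
    total_relevant = sum(relevant)
    if total_relevant == 0:
        raise ValueError("no relevant pairs at this threshold")
    order = sorted(range(len(computer_scores)),
                   key=lambda i: computer_scores[i], reverse=True)
    points = []
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += relevant[i]
        points.append((hits / rank, hits / total_relevant))
    return points


# Made-up example data: four image pairs.
computer = [0.9, 0.8, 0.3, 0.1]
human = [5, 4, 2, 1]
correlation = pearson(computer, human)
curve = precision_recall(computer, human, relevance_threshold=4)
```

Note that the suite reports an estimated curve obtained through its mapping methods, whereas this sketch computes the points directly from a thresholded ground truth.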

CVOnline list

CVOnline maintains a list of publicly available computer vision datasets.

ChaLearn repository

ChaLearn organizes worldwide challenges on Machine Learning and hosts a large set of public databases related to different Pattern Recognition and Computer Vision applications.

It also contains a series of Looking at People data sets related to different aspects of automatic human analysis from multimedia data.


ImageCLEF's Multimedia Retrieval in CLEF.


ImageNet

ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. Currently, it includes an average of over five hundred images per node. The aim of ImageNet is to become a useful resource for researchers, educators, students and all of you who share our passion for pictures.


MediaEval

MediaEval is a benchmarking initiative for multimedia evaluation. It focuses on the "multi" in multimedia: speech, audio, visual content, tags, users, and context.