MediaMill Datasets

MediaMill Bi-concepts

The MediaMill Bi-concept dataset provides a new baseline for image retrieval beyond single concepts. Searching for the co-occurrence of two visual concepts in unlabeled images is an important step towards answering complex user queries. Whereas traditional methods rely on artificial combinations of individual single-concept detectors, bi-concept search is a concept-based retrieval method that uses bi-concept detectors directly. Bi-concept search is found to be superior to oracle linear fusion of single-concept search.

Bi-concepts

The MediaMill Bi-concept dataset contains:

  • Ground truth for 15 bi-concepts and 1 tri-concept. Each bi-concept has 50 positive test examples and 10,000 negative test examples. See the folder Annotations.
  • Bi-concept image search results retrieved by three systems (social, borda, and full), with varying configurations as described in the bi-concept paper.
  • Python code for comparing your system with the baseline.

Download

biconcepts2012test.zip (25MB): all in one package.

Comparing your system with the baselines in two steps

Step 1. For each bi-concept w, first use your system, say 'systemX', to score the 50 positive test examples and the 10,000 negative test examples, and sort them. Save the sorted results to SimilarityIndex/test/systemX/w.txt. The file w.txt shall contain 10,050 lines, where each line starts with a photo id followed by the corresponding score.
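Step 1 could be sketched in Python as follows. This is only an illustration under two assumptions that the description above does not confirm: that results are sorted with the highest score first, and that the photo id and the score are separated by a single space. Check both against the baseline result files in the package. The function name write_ranking and the photo ids in the usage note are hypothetical.

```python
import os

def write_ranking(scores, out_path):
    """Write (photo_id, score) pairs to out_path, one pair per line.

    Assumptions (not confirmed by the dataset description): highest
    score first, and a single space between photo id and score.
    """
    # Sort the photo ids by score, highest first.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    # Create SimilarityIndex/test/systemX/ (or similar) if it is missing.
    out_dir = os.path.dirname(out_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)

    with open(out_path, "w") as f:
        for photo_id, score in ranked:
            f.write(f"{photo_id} {score}\n")
    return ranked
```

For a bi-concept w, a call might look like write_ranking(scores, os.path.join("SimilarityIndex", "test", "systemX", "w.txt")), where scores maps each of the 10,050 photo ids to the score your system assigned.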

Step 2. Assuming that the biconcepts2012test folder is placed at C:/VisualSearch/, set the variable ROOT_PATH in common.py to "C:/VisualSearch/". Add 'systemX' to the variable rankerNameList in pycode/compareSearchEngine.py, and run that Python script to compute the Average Precision score of each system.
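To make the evaluation measure concrete, here is a minimal sketch of the standard non-interpolated Average Precision over a ranked list. It is not the code from compareSearchEngine.py, whose details may differ; the function name average_precision is hypothetical.

```python
def average_precision(ranked_ids, positive_ids):
    """Non-interpolated Average Precision of a ranked list.

    ranked_ids: photo ids in ranked order, best first.
    positive_ids: ids of the positive (relevant) examples.
    """
    positives = set(positive_ids)
    if not positives:
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, photo_id in enumerate(ranked_ids, start=1):
        if photo_id in positives:
            hits += 1
            # Precision at the rank where this positive was found.
            precision_sum += hits / rank

    # Positives missing from the ranking contribute zero precision.
    return precision_sum / len(positives)
```

With the ranking a, b, c, d and positives a and c, the positives are found at ranks 1 and 3, so the score is (1/1 + 2/3) / 2, roughly 0.83.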

Contact

If you have any questions, please contact Xirong Li at xirong@ruc.edu.cn.

Readme First
Xirong Li, Cees G. M. Snoek, Marcel Worring, and Arnold W. M. Smeulders. Harvesting Social Images for Bi-Concept Search. IEEE Transactions on Multimedia, vol. 14, iss. 4, pp. 1091-1104, 2012.