The MediaMill Bi-concept dataset provides a new baseline for image retrieval beyond single concepts. Searching for the co-occurrence of two visual concepts in unlabeled images is an important step towards answering complex user queries. While traditional methods rely on artificial combinations of individual single-concept detectors, bi-concept search is a new concept-based retrieval method that uses bi-concept detectors directly. Bi-concept search is found to be superior to oracle linear fusion of single-concept-based search.
The MediaMill Bi-concept dataset contains:
biconcepts2012test.zip (25MB): all in one package.
Step 1. For each bi-concept w, first use your system, say 'systemX', to score the 50 positive test examples and the 10,000 negative test examples, and sort them. Save the sorted results to SimilarityIndex/test/systemX/w.txt. The file w.txt shall contain 10,050 lines, where each line starts with a photo id followed by the corresponding score.
Step 2. Assuming that the biconcepts2012test folder is placed at C:/VisualSearch/, set the variable ROOT_PATH in common.py to "C:/VisualSearch/". Then add 'systemX' to the variable rankerNameList in pycode/compareSearchEngine.py, and run that Python script to compute the Average Precision scores of the individual systems.
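The two steps above can be sketched in Python as follows. This is only an illustration, not the code shipped in the package: the helper names write_ranking and average_precision are hypothetical, the space separator between photo id and score is an assumption (the text only says each line starts with a photo id followed by the score), and the results are assumed to be sorted by descending score. The Average Precision shown is the standard non-interpolated definition, which may differ in detail from what pycode/compareSearchEngine.py computes.

```python
from pathlib import Path


def write_ranking(scores, out_file):
    """Write one 'photo_id score' line per test image, highest score first.

    `scores` maps photo id -> score from your own system.
    Assumptions: space-separated fields, descending sort order.
    Returns the photo ids in ranked order.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    out_file = Path(out_file)
    # Create e.g. SimilarityIndex/test/systemX/ if it does not exist yet.
    out_file.parent.mkdir(parents=True, exist_ok=True)
    with out_file.open("w") as f:
        for photo_id, score in ranked:
            f.write(f"{photo_id} {score}\n")
    return [photo_id for photo_id, _ in ranked]


def average_precision(ranked_ids, positive_ids):
    """Non-interpolated Average Precision of a ranked list.

    Averages the precision measured at the rank of each positive example,
    dividing by the total number of positives.
    """
    positives = set(positive_ids)
    if not positives:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, photo_id in enumerate(ranked_ids, start=1):
        if photo_id in positives:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(positives)
```

For a bi-concept w, one would call write_ranking with the 10,050 scores and a path such as SimilarityIndex/test/systemX/w.txt under the root folder, then use average_precision on the returned order as a quick sanity check before running the official script. A ranking that places all 50 positives first yields an Average Precision of 1.0.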
If you have any questions, please contact Xirong Li at xirong@ruc.edu.cn.
Readme First
Xirong Li, Cees G. M. Snoek, Marcel Worring, and Arnold W. M. Smeulders. Harvesting Social Images for Bi-Concept Search. IEEE Transactions on Multimedia, vol. 14, iss. 4, pp. 1091-1104, 2012.