Currently, it is impossible to compare the performance of image retrieval algorithms when the Internet is the transport medium. Use of the Internet exposes two major limitations of current image retrieval benchmarking methods.
First, researchers measure performance using private data sets. Because each data set is different, comparisons between retrieval algorithms developed by different researchers are open to dispute.
Second, the performance of image retrieval algorithms is usually measured on a test platform that is isolated from network delays. Such measurements assume that the test platform is tuned so that the memory and disk subsystems do not stall the processor. Contention for these shared subsystems is minimized, so there is very little variance in the measured execution times. The Internet, being a shared medium, changes these assumptions dramatically. Gateways, caches, proxy servers, and similar components introduce unpredictable delays. The challenge is to take this level of variance into account.
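One way to take such variance into account is to repeat each timed query and report a distribution rather than a single number. The following is a minimal sketch under stated assumptions, not the measurement procedure of the benchmark: the names `summarize_latencies` and `time_query`, the `run_query` callable, and the choice of summary statistics are all hypothetical.

```python
import statistics
import time


def summarize_latencies(samples):
    """Summarize repeated response-time measurements (in seconds).

    Hypothetical helper: the median is reported alongside the mean
    because occasional network stalls can skew the mean heavily.
    """
    ordered = sorted(samples)
    return {
        "median": statistics.median(ordered),
        "mean": statistics.fmean(ordered),
        # Sample standard deviation needs at least two measurements.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "max": ordered[-1],
    }


def time_query(run_query, repetitions=10):
    """Time a query callable several times and summarize the spread.

    `run_query` is an assumed zero-argument callable that issues one
    retrieval request and blocks until the response has arrived.
    """
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return summarize_latencies(samples)
```

On an isolated test platform the standard deviation would be expected to stay small; over the Internet, a large gap between the median and the maximum indicates the unpredictable delays described above.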
In other engineering areas, performance analysts have developed benchmarks that compare relational database system performance and also account for Internet delays. However, those benchmarks cannot easily be adapted for image retrieval. Whereas the more common retrieval algorithms query a database for exact matches, image queries produce false hits and missed responses. Moreover, an image retrieval algorithm must rank the elements in the query response. Imaging researchers employ an auxiliary data set called "ground truth" to evaluate such responses.
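To make the notions of false hits and missed responses concrete, a ranked response can be scored against ground truth with the standard precision and recall measures. This is an illustrative sketch only: the function name `score_response`, the cutoff `k`, and the image identifiers are assumptions, and the scoring actually used by the benchmark may differ.

```python
def score_response(ranked, relevant, k):
    """Score the top-k items of a ranked response against ground truth.

    ranked   -- list of image identifiers, best match first (no duplicates)
    relevant -- set of identifiers that the ground truth marks as correct
    k        -- number of top-ranked items to evaluate (assumed cutoff)
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    false_hits = len(top_k) - hits   # returned but not in the ground truth
    misses = len(relevant) - hits    # in the ground truth but not returned
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_hits": false_hits,
        "misses": misses,
    }
```

For example, with the hypothetical response `["a", "b", "c", "d"]`, ground truth `{"a", "c", "e"}`, and `k = 3`, the top three results contain two correct images, one false hit (`"b"`), and one missed response (`"e"`).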
We have developed a new image retrieval benchmark called "Benchathlon" that incorporates these features for measuring image retrieval performance over the Internet. In this paper, we present the prototype version together with some preliminary results. Feedback from participants in the live Benchathlon at EI 2001 will be used to improve future Benchathlon benchmarking contests.