+With nothing except descriptive metadata, we have few options for ranking images. The basic premise behind the current ranking algorithm is that the more often the user’s keywords appear in an image’s metadata, the more likely the image is relevant to the user’s work. There are plenty of ways to fine-tune this approach: adjusting the weight of each metadata field, changing how the fields are interpreted during indexing and querying, and choosing a different matching algorithm for each field. Ultimately, though, the ranking is still based on a very limited amount of information. These fields tell us that an image is a plausible result for a search, but they tell us nothing about the quality of the image. The end result is that our ranking algorithm treats a blurry amateur photo on someone’s Flickr photostream the same as a work by a master painter, as long as the keywords match. In an environment where we could hand-curate every work in our database, this would be acceptable; in the real world, where a lot of low-effort content gets uploaded to the internet, we need a way to separate the wheat from the chaff.
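
To make the limitation concrete, here is a minimal sketch of keyword-count scoring with per-field weights. It is not the actual ranking implementation: the field names, the weights, the `keyword_score` function, and the two sample records are all hypothetical, chosen only to illustrate why metadata alone cannot distinguish quality.

```python
import re
from collections import Counter

# Hypothetical per-field weights; the real fields and values are not given here.
FIELD_WEIGHTS = {"title": 3.0, "tags": 2.0, "creator": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, metadata, weights=FIELD_WEIGHTS):
    """Sum the weighted number of query-term occurrences in each field."""
    terms = set(tokenize(query))
    score = 0.0
    for field, weight in weights.items():
        counts = Counter(tokenize(metadata.get(field, "")))
        score += weight * sum(counts[term] for term in terms)
    return score


# Two made-up records with identical descriptive metadata but very
# different (unrecorded) quality.
blurry_snapshot = {
    "title": "Sunset over the harbor",
    "tags": "sunset harbor",
    "creator": "flickr_user",
}
master_painting = {
    "title": "Sunset over the harbor",
    "tags": "sunset harbor",
    "creator": "A. Painter",
}

snapshot_score = keyword_score("harbor sunset", blurry_snapshot)
painting_score = keyword_score("harbor sunset", master_painting)
print(snapshot_score, painting_score)
```

Both records receive the same score because nothing in the scoring function sees anything other than keyword matches. Tuning the weights changes how much each field counts, but it cannot introduce a quality signal that the metadata does not contain.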