Toward characteristic audio shingles for efficient cross-version music retrieval

Abstract

The general goal of cross-version music retrieval is to identify all versions of a given piece of music by means of a short query audio fragment. To speed up the retrieval process, hashing techniques have been proposed in which the audio material is split into small overlapping shingles (used as hashes) that consist of short feature subsequences. In this paper, we extend this work with the goal of minimizing the number of hash lookups. To this end, one requires larger shingles that characterize the underlying piece of music to a high degree while remaining robust to variations that occur across different versions. As our main contribution, we report on extensive experiments that highlight the delicate trade-off among query length, feature parameters, shingle dimension, and index settings. These insights are of fundamental importance for building efficient cross-version retrieval systems that scale to millions of songs.
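
As an illustration of the shingling idea described above, the following minimal sketch splits a sequence of feature vectors into overlapping shingles by concatenating consecutive frames. It is not the authors' implementation: the function name `make_shingles`, the parameters `shingle_len` and `hop`, and the toy two-dimensional features are all assumptions made for this example, and the abstract does not specify the feature type or the hashing scheme applied to the shingles.

```python
from typing import List, Sequence


def make_shingles(
    features: Sequence[Sequence[float]],
    shingle_len: int,
    hop: int = 1,
) -> List[List[float]]:
    """Concatenate each run of `shingle_len` consecutive feature vectors
    into one flat shingle, advancing by `hop` frames between shingles.

    All names here are hypothetical and chosen for illustration only.
    """
    if shingle_len <= 0 or hop <= 0:
        raise ValueError("shingle_len and hop must be positive")

    shingles: List[List[float]] = []
    # The last valid start index is len(features) - shingle_len.
    for start in range(0, len(features) - shingle_len + 1, hop):
        window = features[start:start + shingle_len]
        # Flatten the window: a shingle of L frames with D dimensions
        # each becomes a single vector of dimension L * D.
        shingles.append([value for frame in window for value in frame])
    return shingles


# Toy input (assumed): 5 frames of 2-dimensional features.
frames = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
shingles = make_shingles(frames, shingle_len=3, hop=1)
```

With these toy values, a longer `shingle_len` yields fewer shingles of higher dimension, which is the kind of trade-off between shingle dimension and number of lookups that the abstract refers to.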

Publication
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)