Extracting predominant local pulse information from music recordings

Abstract

The extraction of tempo and beat information from music recordings is a challenging task, in particular for non-percussive music with soft note onsets and time-varying tempo. In this paper, we introduce a novel mid-level representation that captures musically meaningful local pulse information even for complex music. Our main idea is to derive, for each time position, a sinusoidal kernel that best explains the local periodic nature of a previously extracted note onset representation. We then employ an overlap-add technique that accumulates all these kernels over time to obtain a single function that reveals the predominant local pulse (PLP). Our concept introduces a high degree of robustness to noise and distortions resulting from weak and blurry onsets. Furthermore, the resulting PLP curve reveals the local pulse information even in the presence of continuous tempo changes and indicates a kind of confidence in the periodicity estimation. As a further contribution, we show how our PLP concept can be used as a flexible tool for enhancing tempo estimation and beat tracking. The practical relevance of our approach is demonstrated by extensive experiments based on music recordings of various genres.
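The two steps named in the abstract (a best-fitting sinusoidal kernel per time position, followed by overlap-add) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `plp_sketch`, the Hann window, the 6-second window length, the 30 to 300 BPM tempo grid, and the final half-wave rectification are all assumptions made here for illustration, and the note onset representation is taken as a given input array.

```python
import numpy as np


def plp_sketch(novelty, feature_rate, tempo_range_bpm=(30, 300),
               win_sec=6.0, hop=1):
    """Illustrative PLP-style curve from a note onset (novelty) curve.

    All parameter defaults are assumptions of this sketch, not values
    taken from the abstract.
    """
    novelty = np.asarray(novelty, dtype=float)
    n = len(novelty)

    # Analysis window centered on each time position (assumed: Hann).
    half = int(round(win_sec * feature_rate / 2))
    win_len = 2 * half + 1
    window = np.hanning(win_len)

    # Candidate pulse frequencies in Hz, one per integer BPM value.
    tempi = np.arange(tempo_range_bpm[0], tempo_range_bpm[1] + 1)
    freqs = tempi / 60.0

    # Time axis relative to the frame center, in seconds.
    t_rel = np.arange(-half, half + 1) / feature_rate

    # Complex exponentials used to measure local periodicity.
    basis = np.exp(-2j * np.pi * np.outer(freqs, t_rel))

    padded = np.pad(novelty, half)      # zero-pad so every frame is full
    accum = np.zeros(n + 2 * half)      # overlap-add buffer
    local_tempo = []

    for center in range(0, n, hop):
        segment = padded[center:center + win_len] * window

        # Fourier coefficient of the windowed segment at each tempo.
        coeffs = basis @ segment

        # The strongest coefficient gives frequency and phase of the
        # sinusoid that best explains the local periodicity.
        best = int(np.argmax(np.abs(coeffs)))
        phase = np.angle(coeffs[best])
        kernel = window * np.cos(2 * np.pi * freqs[best] * t_rel + phase)

        # Overlap-add the windowed kernel at its time position.
        accum[center:center + win_len] += kernel
        local_tempo.append(tempi[best])

    # Remove padding; keep only the positive part (assumed rectification).
    plp = np.maximum(accum[half:half + n], 0.0)
    return plp, np.array(local_tempo)
```

Calling `plp_sketch(novelty, feature_rate=100.0)` returns the accumulated curve and the tempo selected for each frame. Where neighboring kernels agree in frequency and phase they add constructively, and where they disagree they partly cancel, which is one way to read the confidence behavior described in the abstract.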

Publication
IEEE Transactions on Audio, Speech, and Language Processing