Multiple fundamental frequency estimation based on sparse representations in a structured dictionary

Article ID	Journal	Published Year	Pages	File Type
558859	Digital Signal Processing	2013	11 Pages	PDF

Abstract

Automatic transcription of polyphonic music is an important task in audio signal processing, which involves identifying the fundamental frequencies (pitches) of several notes played simultaneously. Its difficulty stems from the fact that the harmonics of different notes tend to overlap, especially in Western music. This overlap makes it hard to assign the harmonics to their true fundamental frequencies and to deduce the spectra of the individual notes from their sum. We present a multi-pitch estimation algorithm based on sparse representations in a structured dictionary suited to the spectra of music signals. In each vector (atom) of this dictionary, all elements are forced to be zero except those that represent a fundamental frequency and its harmonics. Thanks to the structured dictionary, the algorithm does not require a large or diverse training dataset and is computationally more efficient than alternative methods. The performance of the proposed structured-dictionary transcription system is examined empirically, and its advantage over alternative dictionary learning methods is demonstrated.
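
The structured-dictionary idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the nonzero atom values are obtained or which sparse solver is used. The sketch assumes fixed 1/h harmonic amplitudes in place of trained values, a 5 Hz frequency-bin spacing, and plain orthogonal matching pursuit as the sparse solver. The names `harmonic_dictionary` and `greedy_pitches` are hypothetical.

```python
import numpy as np

def harmonic_dictionary(f0s, n_bins, bin_hz, n_harmonics=8):
    """Build one unit-norm atom per candidate f0.

    Each atom is zero everywhere except at the spectral bins of f0 and
    its harmonics. The 1/h amplitude decay is an assumption made for
    illustration; it stands in for values that would be learned.
    """
    D = np.zeros((n_bins, len(f0s)))
    for j, f0 in enumerate(f0s):
        for h in range(1, n_harmonics + 1):
            k = int(round(h * f0 / bin_hz))
            if k < n_bins:
                D[k, j] = 1.0 / h
        D[:, j] /= np.linalg.norm(D[:, j])
    return D

def greedy_pitches(x, D, n_notes):
    """Select n_notes atoms by orthogonal matching pursuit (assumed solver)."""
    residual = x.copy()
    support = []
    for _ in range(n_notes):
        scores = D.T @ residual
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Refit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return sorted(support)

# Candidate pitches: MIDI notes 48..84 (C3 to C6), equal temperament, A4 = 440 Hz.
midi = np.arange(48, 85)
f0s = 440.0 * 2.0 ** ((midi - 69) / 12.0)
D = harmonic_dictionary(f0s, n_bins=2048, bin_hz=5.0)

# Synthetic magnitude spectrum: C4 (MIDI 60) plus a quieter G4 (MIDI 67).
# At this resolution the 3rd harmonic of C4 and the 2nd harmonic of G4
# fall into the same bin, so the two notes overlap.
x = 1.0 * D[:, 12] + 0.7 * D[:, 19]

estimated = [int(m) for m in midi[greedy_pitches(x, D, n_notes=2)]]
print(estimated)
```

On this noiseless synthetic mixture the sketch recovers MIDI notes 60 and 67 despite the shared harmonic bin. Real spectra, unknown polyphony, and learned atom values would need considerably more care than this toy example shows.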

► We present a multi-pitch estimation algorithm using a structured dictionary.
► The resulting sparse representation is suitable for music signals.
► The proposed algorithm is computationally efficient.
► It does not require a large training dataset.

Keywords

Music information retrieval; Sparse representations