Article ID: 562029
Journal: Signal Processing
Published Year: 2006
Pages: 15
File Type: PDF
Abstract

We consider two approaches for sparse decomposition of polyphonic music: a time-domain approach based on a shift-invariant model, and a frequency-domain approach based on phase-invariant power spectra. When trained on an example of a MIDI-controlled acoustic piano recording, both methods produce dictionary vectors or sets of vectors which represent underlying notes, and produce component activations related to the original MIDI score. The time-domain method is more computationally expensive, but produces sample-accurate spike-like activations and can be used for a direct time-domain reconstruction. The spectral-domain method discards phase information, but is faster than the time-domain method and retains more higher-frequency harmonics. These results suggest that these two methods would provide a powerful yet complementary approach to automatic music transcription or object-based coding of musical audio.
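The remark that the spectral-domain method discards phase information can be illustrated with a small, self-contained sketch. It is not the authors' decomposition algorithm; it only shows, under simple assumptions, that the power spectrum of a frame is unchanged when the phase of a sinusoid changes, while the time-domain waveform differs. The names `power_spectrum` and `sinusoid`, the frame length of 64 samples, and the choice of 5 cycles per frame are hypothetical and chosen purely for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum |X[k]|^2 for bins k = 0..N/2.

    Illustrative only; a real system would use an FFT.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def sinusoid(n, cycles, phase):
    """One frame of a sinusoid with a whole number of cycles per frame."""
    return [math.sin(2 * math.pi * cycles * t / n + phase) for t in range(n)]


N = 64  # assumed frame length
a = sinusoid(N, cycles=5, phase=0.0)
b = sinusoid(N, cycles=5, phase=1.3)  # same "note", different phase

pa = power_spectrum(a)
pb = power_spectrum(b)

# Largest sample-wise difference between the two waveforms.
waveform_gap = max(abs(x - y) for x, y in zip(a, b))
# Largest bin-wise difference between the two power spectra.
spectrum_gap = max(abs(x - y) for x, y in zip(pa, pb))
# Bin holding the most energy.
peak_bin = max(range(len(pa)), key=pa.__getitem__)

print("waveform gap:", waveform_gap)
print("spectrum gap:", spectrum_gap)
print("peak bin:", peak_bin)
```

The two frames differ substantially sample by sample, yet their power spectra agree to within rounding error. This is the sense in which a power-spectrum representation is phase-invariant, and also why a waveform cannot be recovered from it directly.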
