Computational auditory induction as a missing-data model-fitting problem with Bregman divergence

Article code	Journal code	Publication year	English article	Full-text version
567532	876100	2011	19-page PDF	Free download

Keywords

EM algorithm; Non-negative matrix factorization; Missing data; Auxiliary function; Bregman divergence

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

The human auditory system has the ability, known as auditory induction, to estimate the missing parts of a continuous auditory stream briefly covered by noise and perceptually resynthesize them. In this article, we formulate this ability as a model-based spectrogram analysis and clustering problem with missing data, show how to solve it using an auxiliary function method, and explain how this method is generally related to the expectation–maximization (EM) algorithm for a certain type of divergence measures called Bregman divergences, thus enabling the use of prior distributions on the parameters. We illustrate how our method can be used to simultaneously analyze a scene and estimate missing information with two algorithms: the first, based on non-negative matrix factorization (NMF), performs analysis of polyphonic multi-instrumental musical pieces. Our method allows this algorithm to cope with gaps within the audio data, estimating the timbre of the instruments and their pitch, and reconstructing the missing parts. The second, based on a recently introduced technique for the analysis of complex acoustical scenes called harmonic-temporal clustering (HTC), enables us to perform robust fundamental frequency estimation from incomplete speech data.
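For readers unfamiliar with the term, the standard textbook definition of a Bregman divergence can be written as follows. This is not quoted from the article; the symbols \varphi, x and y are notation assumed here for illustration.

```latex
% Bregman divergence generated by a strictly convex, differentiable function \varphi
\[
  d_{\varphi}(x \,\|\, y) \;=\; \varphi(x) - \varphi(y) - \varphi'(y)\,(x - y) \;\ge\; 0 .
\]
% Well-known special cases for x, y > 0:
\begin{align*}
  \varphi(x) &= x^{2}          &\Rightarrow\quad d_{\varphi}(x \,\|\, y) &= (x - y)^{2}
      && \text{(squared Euclidean distance)} \\
  \varphi(x) &= x \log x - x   &\Rightarrow\quad d_{\varphi}(x \,\|\, y) &= x \log\frac{x}{y} - x + y
      && \text{(I-divergence, generalized KL)} \\
  \varphi(x) &= -\log x        &\Rightarrow\quad d_{\varphi}(x \,\|\, y) &= \frac{x}{y} - \log\frac{x}{y} - 1
      && \text{(Itakura-Saito divergence)}
\end{align*}
```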

Research highlights
► Auditory induction as a model-based spectrogram analysis problem with missing data.
► Simultaneously analyze acoustic scenes and estimate missing information.
► Link between missing-data model fitting and the EM algorithm for Bregman divergences.
► Robust F0 estimation of incomplete speech data using Harmonic-Temporal Clustering.
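
The idea of fitting a model while estimating missing spectrogram bins can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain NMF with the I-divergence and a simple fill-in step, in which missing bins are replaced by the current model before each standard multiplicative update. The function names `masked_nmf` and `i_divergence`, the boolean `mask` convention (True means observed), and all parameter values are assumptions made for this example.

```python
import numpy as np


def i_divergence(v, model, mask):
    """I-divergence (generalized KL) summed over the observed entries only."""
    eps = 1e-12
    v_obs, m_obs = v[mask], model[mask]
    return float(np.sum(v_obs * np.log((v_obs + eps) / (m_obs + eps)) - v_obs + m_obs))


def masked_nmf(v, mask, rank, n_iter=200, seed=0):
    """Fit V ~= W @ H using only entries where mask is True (hypothetical helper).

    Before each update, missing entries are filled with the current model value,
    then the usual multiplicative I-divergence update is applied to the
    completed matrix.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = v.shape
    w = rng.random((n_freq, rank)) + 0.1   # spectral templates (non-negative)
    h = rng.random((rank, n_frames)) + 0.1  # activations (non-negative)
    eps = 1e-12
    history = []
    for _ in range(n_iter):
        # Fill-in step, then update of the templates W.
        model = w @ h
        filled = np.where(mask, v, model)
        w *= ((filled / (model + eps)) @ h.T) / (h.sum(axis=1) + eps)

        # Fill-in step with the refreshed model, then update of the activations H.
        model = w @ h
        filled = np.where(mask, v, model)
        h *= (w.T @ (filled / (model + eps))) / (w.sum(axis=0)[:, None] + eps)

        history.append(i_divergence(v, w @ h, mask))
    return w, h, history


if __name__ == "__main__":
    # Synthetic rank-2 "spectrogram" with roughly 20% of the bins hidden.
    rng = np.random.default_rng(1)
    v = (rng.random((20, 2)) + 0.1) @ (rng.random((2, 30)) + 0.1)
    mask = rng.random(v.shape) > 0.2
    w, h, history = masked_nmf(v, mask, rank=2)
    reconstruction = np.where(mask, v, w @ h)  # observed bins kept, gaps filled by the model
    print("divergence on observed bins:", history[0], "->", history[-1])
```

In this plain sketch a frame with no observed bins receives no information from the data, so its activations stay at their initial values; filling such gaps meaningfully requires additional model structure or priors beyond what is shown here.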

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 53, Issue 5, May–June 2011, Pages 658–676

Authors

Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama