Article code: 558220
Journal code: 1451691
Publication year: 2016
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
Speech enhancement using Maximum A-Posteriori and Gaussian Mixture Models for speech and noise Periodogram estimation
Persian translation of the title
بهبود گفتار با استفاده از مدل‌های مخلوط گوسی و A پسینی حداکثر و برای برآورد دوره نگار گفتار و سر و صدا
Keywords
Gaussian Mixture Model (GMM); Maximum A-Posteriori (MAP); Wiener filter; speech
Related topics
Engineering and Basic Sciences; Computer Engineering; Signal Processing
English abstract


• We use Gaussian Mixture Modelling (GMM) to model Periodograms of speech and noise.
• A method to find the proper size of GMMs is discussed.
• We propose a Maximum A-Posteriori (MAP) Periodogram estimation using GMMs.
• The GMMs are used to enhance the noisy speech using a Wiener filter.
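
The highlights above are built around the Periodogram of a signal frame. As a minimal sketch of that quantity, assuming the common definition |X[k]|^2 / N for a frame of N samples (the paper's exact framing, windowing, and scaling are not given here), it can be computed with a plain DFT. The function name `periodogram` and the toy frame are hypothetical and chosen only for illustration.

```python
import cmath
import math


def periodogram(frame):
    """Return |X[k]|^2 / N for the non-negative frequency bins of one frame.

    Assumed definition; real systems usually apply a window and an FFT.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):  # bins 0 .. N/2 are enough for real input
        x_k = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(x_k) ** 2 / n)
    return spectrum


# Toy frame (illustrative only): a cosine with exactly 2 cycles in 16 samples.
frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
p = periodogram(frame)

# The energy concentrates in the bin that matches the cosine frequency.
peak_bin = max(range(len(p)), key=p.__getitem__)
```

In a GMM-based scheme such as the one summarized here, vectors like `p` (one per frame) would form the training data; this sketch stops at computing a single vector.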

In speech enhancement, Gaussian Mixture Models (GMMs) can be used to model the Probability Density Function (PDF) of the Periodograms of speech and of different noise types. These GMMs are created by applying the Expectation Maximization (EM) algorithm to large datasets of speech and noise Periodograms, which classifies them into a small number of clusters whose centroid Periodograms are the mean vectors of the GMMs. The GMMs are then used to realize the Maximum A-Posteriori (MAP) estimation of the speech and noise Periodograms present in a noisy speech observation. To realize the MAP estimation, a constrained optimization algorithm is first proposed, which attains relatively good enhancement results but with high processing times. Because of the constraints in the optimization algorithm, possible local maxima may lead to incorrect estimates. A simple analytic MAP algorithm is therefore proposed to attain the global maximum in less computation time. In the new method, the MAP formula is simplified as far as possible so that the maxima are found by solving a set of equations rather than by the conventional numerical methods used in optimization. This method results in excellent speech enhancement with a relatively short processing time.
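
The abstract ends with a Wiener filter driven by the estimated speech and noise Periodograms. As a hedged illustration of that last step only, the textbook Wiener gain per frequency bin is G[k] = P_s[k] / (P_s[k] + P_n[k]). The sketch below assumes the two Periodogram estimates are already available as plain lists; the MAP estimation itself is not reproduced, and the names `wiener_gain`, `enhance`, and the `floor` parameter are hypothetical.

```python
def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Textbook Wiener gain per bin: P_s / (P_s + P_n).

    `floor` (an assumption of this sketch) guards against division by zero
    when both estimates are zero in a bin.
    """
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]


def enhance(noisy_spectrum, speech_psd, noise_psd):
    """Scale each complex bin of the noisy spectrum by its Wiener gain."""
    gain = wiener_gain(speech_psd, noise_psd)
    return [g * x for g, x in zip(gain, noisy_spectrum)]


# Toy values (illustrative only, not taken from the paper).
speech_psd = [4.0, 1.0, 0.0]
noise_psd = [1.0, 1.0, 2.0]
noisy_spectrum = [2 + 0j, 1 + 1j, 3 + 0j]

gain = wiener_gain(speech_psd, noise_psd)
enhanced = enhance(noisy_spectrum, speech_psd, noise_psd)
```

Bins dominated by speech keep most of their amplitude, while bins where the noise estimate dominates are attenuated toward zero; the phase of the noisy spectrum is left unchanged.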

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 58–71