Scalable distributed speech recognition using Gaussian mixture model-based block quantisation

Article code	Journal code	Publication year	English article	Full text
567780	876155	2006	13 pages, PDF	Free download

Keywords

Distributed speech recognition; Gaussian mixture models

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

In this paper, we investigate the use of block quantisers based on Gaussian mixture models (GMMs) for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. Specifically, we consider the multi-frame scheme, where temporal correlation across MFCC frames is exploited by the Karhunen–Loève transform of the block quantiser. Compared with vector quantisers, the GMM-based block quantiser has relatively low computational and memory requirements which are independent of bitrate. More importantly, it is bitrate scalable, which means that the bitrate can be adjusted without the need for re-training. Static parameters such as the GMM and transform matrices are stored at the encoder and decoder and bit allocations are calculated ‘on-the-fly’ without intensive processing. We have evaluated the quantisation scheme on the Aurora-2 database in a DSR framework. We show that jointly quantising more frames and using more mixture components in the GMM leads to higher recognition performance. The multi-frame GMM-based block quantiser achieves a word error rate (WER) of 2.5% at 800 bps, which is less than 1% degradation from the baseline (unquantised) word recognition accuracy, and graceful degradation down to a WER of 7% at 300 bps.
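
The claim that bit allocations can be computed on the fly for any bitrate can be illustrated with a small sketch. It uses the textbook high-resolution rule for transform coding, b_i = B/n + 0.5 log2(var_i / geometric mean of the variances), which may differ from the exact rule used in the paper. The function name `allocate_bits`, the example variances, and the bit budgets are hypothetical and chosen only for illustration; the variances stand in for the stored Karhunen–Loève eigenvalues of one mixture component.

```python
import math


def allocate_bits(variances, total_bits):
    """Split total_bits across transform components (fractional bits allowed).

    Assumption: the classic high-resolution rule
        b_i = B/n + 0.5 * log2(var_i / geometric_mean(var)),
    not necessarily the paper's exact formula. Components whose share would be
    negative get 0 bits and the rule is re-applied to the remaining ones.
    """
    bits = [0.0] * len(variances)
    if total_bits <= 0:
        return bits
    active = list(range(len(variances)))
    while True:
        logs = [math.log2(variances[i]) for i in active]
        mean_log = sum(logs) / len(logs)  # log2 of the geometric mean
        share = total_bits / len(active)
        trial = [share + 0.5 * (lv - mean_log) for lv in logs]
        if all(b >= 0 for b in trial):
            for i, b in zip(active, trial):
                bits[i] = b
            return bits
        # Drop components that would receive a negative number of bits.
        active = [i for i, b in zip(active, trial) if b >= 0]


# Hypothetical stored statistics: the same variances serve every bitrate,
# so changing the budget needs no re-training.
stored_variances = [16.0, 4.0, 1.0, 1.0]
for budget in (8, 2):
    print(budget, allocate_bits(stored_variances, budget))
```

Only the budget changes between the two calls; the stored variances are reused as-is, which is the sense in which such a scheme is bitrate scalable. A practical coder would additionally round the fractional allocations to integer numbers of quantiser levels.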

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 48, Issue 6, June 2006, Pages 746–758

Authors

Stephen So, Kuldip K. Paliwal
