Free article download: Feature analysis for discriminative confidence estimation in spoken term detection

Article code	Journal code	Publication year	English article	Full-text version
558274	874889	2014	32-page PDF	Free download

English title of the ISI article

Feature analysis for discriminative confidence estimation in spoken term detection

Keywords

Feature analysis, confidence, spoken term detection, speech recognition

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

• Feature analysis for spoken term detection (STD) on English (meeting domain) and Spanish (read speech) data in a discriminative confidence estimation framework.
• Feature analysis is based on groups that are defined according to their information sources: lattice-based features, duration-based features, lexical features, Levenshtein distance-based features, position and prosodic features (pitch and energy).
• Feature analysis employs two well-known and established models: linear regression, which targets the most relevant features, and logistic linear regression, which targets the most discriminative features. Individual and incremental analyses are presented for both models.
• Results demonstrate significant improvement with the 3–5 most informative features compared with using the single best feature for STD confidence estimation.
• The best feature set comprises features from different groups: lattice-based and lexical features are among the most informative groups in general, and duration and energy are more informative for read speech data.

Discriminative confidence estimation based on multi-layer perceptrons (MLPs) and multiple features has shown a significant advantage over the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle features derived from a multitude of sources, using all possible features may lead to overly complex models that generalize poorly. In this paper, we design an extensive set of features and analyze their contribution to STD individually and as a group. The main goal is to choose a small set of features that is sufficiently informative while keeping the model simple and generalizable. We employ two established models to conduct the analysis: linear regression, which targets the most relevant features, and logistic linear regression, which targets the most discriminative features. We find that the most informative features come from diverse sources (ASR decoding, duration and lexical properties) and that the two models deliver highly consistent feature rankings. STD experiments on both English and Spanish data demonstrate significant performance gains with the proposed feature sets.
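To make the idea of individual and incremental feature analysis concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy synthetic data set with three hypothetical features (`lattice_score`, `duration`, `noise`), fits a plain logistic regression by gradient descent, and uses training cross-entropy as the ranking criterion. The feature names, data generator, and scoring criterion are all illustrative assumptions; the paper's own feature set, models, and evaluation are described in the full text.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def training_loss(rows, labels, steps=200, lr=0.5):
    """Fit logistic regression by batch gradient descent and
    return the mean cross-entropy on the training data."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    total = 0.0
    for x, y in zip(rows, labels):
        p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / n


def make_data(n=200, seed=0):
    # Synthetic detections (assumption): label 1 = hit, 0 = false alarm.
    rng = random.Random(seed)
    data = {"lattice_score": [], "duration": [], "noise": []}
    labels = []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 2 * y - 1
        labels.append(y)
        data["lattice_score"].append(1.0 * sign + rng.gauss(0, 1))  # strong cue
        data["duration"].append(0.5 * sign + rng.gauss(0, 1))       # weaker cue
        data["noise"].append(rng.gauss(0, 1))                       # uninformative
    return data, labels


def rows_for(data, names):
    # Build one feature vector per detection from the named columns.
    return [list(values) for values in zip(*(data[name] for name in names))]


def individual_ranking(data, labels):
    # Individual analysis: score each feature on its own, lowest loss first.
    scored = [(name, training_loss(rows_for(data, [name]), labels)) for name in data]
    return sorted(scored, key=lambda item: item[1])


def incremental_selection(data, labels):
    # Incremental analysis: greedily add the feature that lowers the loss most.
    chosen, history, remaining = [], [], list(data)
    while remaining:
        losses = {name: training_loss(rows_for(data, chosen + [name]), labels)
                  for name in remaining}
        best = min(losses, key=losses.get)
        chosen.append(best)
        remaining.remove(best)
        history.append((best, losses[best]))
    return history


data, labels = make_data()
ranking = individual_ranking(data, labels)
order = incremental_selection(data, labels)
for name, loss in ranking:
    print(f"individual  {name:14s} loss={loss:.3f}")
for name, loss in order:
    print(f"incremental +{name:13s} loss={loss:.3f}")
```

In this toy setting the strongest cue is ranked first and the uninformative feature last. A real analysis would score on held-out data rather than training loss, which this sketch omits for brevity.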

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 28, Issue 5, September 2014, Pages 1083–1114

Authors

Javier Tejedor, Doroteo T. Toledano, Dong Wang, Simon King, José Colás
