Article code: 565888
Journal code: 1452027
Publication year: 2015
English article: 17-page PDF
Full-text version: Free download
English Title of the ISI Article
Integrating meta-information into recurrent neural network language models
Keywords
Recurrent neural network, language models, part-of-speech, socio-situational setting, topic, token size
Related Subjects
Engineering and Physical Sciences › Computer Engineering › Signal Processing
English Abstract


• Recurrent neural network language models benefit from the integration of meta-information.
• Meta-information of various types, word-level, discourse-level, and intrinsic, is valuable.
• If meta-information can be robustly predicted, it has potential to improve performance.
• Intrinsic meta-information, word- and sentence-length, is trivial to obtain, but still useful.
• Approach is validated with the Spoken Dutch Corpus and the Wall Street Journal data set.

Due to their advantages over conventional n-gram language models, recurrent neural network language models (rnnlms) recently have attracted a fair amount of research attention in the speech recognition community. In this paper, we explore one advantage of rnnlms, namely, the ease with which they allow the integration of additional knowledge sources. We concentrate on features that provide complementary information w.r.t. the lexical identities of the words. We refer to such information as meta-information. We single out three cases and investigate their merits by means of N-best list re-scoring experiments on a challenging corpus of spoken Dutch (referred to as cgn) as well as on the English Wall Street Journal (wsj) corpus. First, we look at Parts of Speech (POS) tags and lemmas, two sources of word-level linguistic information that are known to make a contribution to the performance of conventional language models. We confirm that rnnlms can benefit from these sources as well. Second, we investigate socio-situational settings (ssss) and topics, two sources of discourse-level information that are also known to benefit language models. ssss are present in the cgn data, and can be seen as a proxy for the language register. For the purposes of our investigation, we assume that information on the sss can be captured at the moment at which speech is recorded. Topics, i.e., treatments of different subjects, are present in the wsj data. In order to predict POS, lemmas, sss and topic, a second rnnlm is coupled to the main rnnlm. We refer to this architecture as a recurrent neural network tandem language model (rnntlm). Our experimental findings show that if high-quality meta-information labels are available, both word-level and discourse-level information improve performance of language models. Third, we investigate sentence length and word length (i.e., token size), two sources of intrinsic information that are readily available for exploitation because they are known at the time of re-scoring. Intrinsic information has been largely overlooked by language modeling research. The results of both experiments on cgn data and wsj data show that integrating sentence length and word length can achieve improvement. rnnlms allow these features to be incorporated with ease, and obtain improved performance.
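To make the input-level integration described above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: a word-level meta-information stream (POS tags are used here as the example) is embedded and concatenated with the word embedding before being fed to the recurrent layer. The class name MetaRNNLM, the layer sizes, and the toy usage are illustrative assumptions; the paper's rnntlm additionally predicts the meta labels with a second, coupled rnnlm rather than assuming gold labels.

```python
# Minimal sketch (assumed names/sizes): an RNN language model whose input
# concatenates a word embedding with a meta-information (e.g. POS tag) embedding.
import torch
import torch.nn as nn


class MetaRNNLM(nn.Module):
    def __init__(self, vocab_size, n_tags, word_dim=128, tag_dim=16, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.tag_emb = nn.Embedding(n_tags, tag_dim)       # meta-information embedding
        self.rnn = nn.RNN(word_dim + tag_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)       # next-word prediction layer

    def forward(self, words, tags, hidden=None):
        # words, tags: (batch, seq_len) integer indices
        x = torch.cat([self.word_emb(words), self.tag_emb(tags)], dim=-1)
        h, hidden = self.rnn(x, hidden)
        return self.out(h), hidden


# Toy usage: per-position next-word log-probabilities, as used in N-best re-scoring.
model = MetaRNNLM(vocab_size=10000, n_tags=50)
words = torch.randint(0, 10000, (1, 12))
tags = torch.randint(0, 50, (1, 12))
logits, _ = model(words, tags)
log_probs = torch.log_softmax(logits, dim=-1)
```

In this kind of setup, discourse-level features such as the socio-situational setting or topic, and intrinsic features such as sentence and word length, can be appended in the same way as additional input streams.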

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 73, October 2015, Pages 64–80