Towards precise and robust automatic synchronization of live speech and its transcripts

Article code	Journal code	Publication year	English article	Full text
565974	875886	2011	16-page PDF	Free download

Keywords

Hidden Markov models

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

This paper presents our efforts in automatically synchronizing spoken utterances with their transcripts (textual contents) (ASUT), where the speech is a live stream and its corresponding transcripts are known. This task is first simplified to the problem of detecting the end times of spoken utterances online, and then a solution based on a novel frame-synchronous likelihood ratio test (FSLRT) procedure is proposed. We detail the formulation and implementation of the proposed FSLRT procedure under the Hidden Markov Model (HMM) framework, and we study its properties and parameter settings empirically. Because synchronization failures may occur in FSLRT-based ASUT systems, this paper also extends the FSLRT procedure to a multiple-instance version to increase the robustness of the system. The proposed multiple-instance FSLRT can detect synchronization failures and restart the system from an appropriate point. Therefore, a fully automatic FSLRT-based ASUT system can be constructed. The FSLRT-based ASUT system is evaluated in a simultaneous broadcast news subtitling task. Experimental results show that the proposed method achieves satisfying performance and outperforms an automatic speech recognition-based method in terms of both robustness and precision. Finally, the FSLRT-based news subtitling system can correctly subtitle about 90% of the sentences with an average time deviation of about 100 ms, running at a speed of 0.37 times real time (RT).

Research highlights
► The problem of automatically synchronizing live speech with its transcripts is addressed.
► A novel frame-synchronous likelihood ratio test (FSLRT) procedure is proposed.
► The FSLRT is augmented with a multiple-instance strategy to increase its robustness.
► The proposed algorithm achieves satisfying performance in a simultaneous broadcasting news subtitling task.
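
As a rough illustration of the end-time detection idea described in the abstract, the following Python sketch runs a likelihood ratio test frame by frame over a stream. It is not the authors' FSLRT formulation, which the abstract does not spell out. The function name `detect_end_frame`, the two per-frame log-likelihood inputs, the averaging window, and the threshold are all assumptions made for illustration; a real system would obtain the scores from its HMM acoustic models.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


def detect_end_frame(
    frame_scores: Iterable[Tuple[float, float]],
    window: int = 5,
    threshold: float = 2.0,
) -> Optional[int]:
    """Return the index of the first frame at which the windowed mean
    log-likelihood ratio exceeds `threshold`, or None if it never does.

    Each item is (log_lik_ended, log_lik_ongoing) for one frame. Both
    values are hypothetical inputs standing in for model scores under
    "the utterance has ended" and "the utterance is still ongoing".
    """
    recent = deque(maxlen=window)  # keeps only the last `window` ratios
    for t, (ll_ended, ll_ongoing) in enumerate(frame_scores):
        # Log-likelihood ratio for this frame: positive favors "ended".
        recent.append(ll_ended - ll_ongoing)
        # Decide only once a full window of evidence is available.
        if len(recent) == window and sum(recent) / window > threshold:
            return t
    return None
```

Averaging over a short window, rather than thresholding a single frame, is one simple way to trade a little latency for fewer false alarms. With ten frames scored `(-5.0, -1.0)` followed by ten frames scored `(-1.0, -5.0)`, calling `detect_end_frame(scores, window=5, threshold=2.0)` returns 13, four frames after the scores flip.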

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 53, Issue 4, April 2011, Pages 508–523

Authors

Jie Gao, Qingwei Zhao, Yonghong Yan
