Article code: 557984
Journal code: 1451694
Publication year: 2006
English article: 25-page PDF
English title of the ISI article
A linguistically motivated approach to grapheme-to-phoneme conversion for Korean
Related topics
Engineering and Basic Sciences; Computer Engineering; Signal Processing
English abstract

This paper describes a hand-written rule-based grapheme-to-phoneme (GTP) conversion system for Korean built within the Festival text-to-speech (TTS) synthesis framework. The core of the GTP conversion system is a simple implementation of nine linguistically motivated morphophonological rules. These rules, which are well known to students of Korean linguistics, were implemented in Festival rewrite formalism, and were applied to 1.3 million distinct orthographic words (space-delimited eojeols) from the Korean Newswire corpus. The outputs were evaluated against a representative subset of eojeols. The subset was examined by three native speakers of Korean, who judged 91.17% of the word types in a stratified sample of Korean eojeols to be acceptable pronunciations, which means that our system converted 99.63% of the grapheme tokens correctly. This performance is comparable to that obtained from earlier studies such as Kim et al. [Morpheme-based grapheme to phoneme conversion using phonetic patterns and morphophonemic connectivity information. ACM Transactions on Asian Language Information Processing 1 (1) (2002) 65–82] which, contrary to our system, used an elaborate morphological analysis module. This is evidence of the potential benefit of well-abstracted linguistic knowledge. In addition, because our approach is based on well-known linguistic principles, error analysis is fairly straightforward. Straightforward error analysis is an essential step in knowing what features are likely to be informative in training a hybrid system where exceptions to rules are handled by a machine-learning component.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 20, Issue 4, October 2006, Pages 357–381