کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
558524 | 874946 | 2009 | 20 صفحه PDF | دانلود رایگان |
Linguistic rules have been assumed to be the best technique for determining the syllabification of unknown words. This has recently been challenged for the English language where data-driven algorithms have been shown to outperform rule-based methods. It may be possible, however, that data-driven methods are only better for languages with complex syllable structures. In this study, three rule-based automatic syllabification systems and two data-driven automatic syllabification systems (Syllabification by Analogy and the Look-Up Procedure) are compared on a language with lower syllabic complexity – Italian. Comparing the performance using a lexicon containing 44,720 words, the best data-driven algorithm (Syllabification by Analogy) achieved 97.70% word accuracy while the best rule set correctly syllabified 89.77% words. These results show that data-driven methods can also outperform rule-based methods on Italian syllabification, a language of low syllabic complexity.
Journal: Computer Speech & Language - Volume 23, Issue 4, October 2009, Pages 444–463