Article ID Journal Published Year Pages File Type
558524 Computer Speech & Language 2009 20 Pages PDF
Abstract

Linguistic rules have been assumed to be the best technique for determining the syllabification of unknown words. This has recently been challenged for the English language where data-driven algorithms have been shown to outperform rule-based methods. It may be possible, however, that data-driven methods are only better for languages with complex syllable structures. In this study, three rule-based automatic syllabification systems and two data-driven automatic syllabification systems (Syllabification by Analogy and the Look-Up Procedure) are compared on a language with lower syllabic complexity – Italian. Comparing the performance using a lexicon containing 44,720 words, the best data-driven algorithm (Syllabification by Analogy) achieved 97.70% word accuracy while the best rule set correctly syllabified 89.77% words. These results show that data-driven methods can also outperform rule-based methods on Italian syllabification, a language of low syllabic complexity.

Related Topics
Physical Sciences and Engineering Computer Science Signal Processing
Authors
, , ,