Hybrid approach for conceptual segmentation of spontaneous Arabic oral utterances

Article ID	Journal	Published Year	Pages	File Type
6902116	Procedia Computer Science	2017	8 Pages	PDF

Abstract

This work is part of research on the automatic understanding of spontaneous Arabic speech. In this paper, we propose a hybrid method for the conceptual segmentation of spontaneous Arabic oral utterances. The method has two main parts. The symbolic part performs conceptual segmentation and handles disfluencies (simple and complex). The numerical part uses machine learning to extract rules for processing Out-Of-Vocabulary (OOV) words (unknown words and misrecognized words). We implemented the Arabic Oral Conceptual Segmentation Module (AOCSM) using the proposed method. AOCSM is evaluated by assessing each part (symbolic and numerical) separately. The symbolic part was evaluated and compared with CSM [4]. The numerical part was evaluated through the literal understanding module of the SARF system [2], in a comparative study of the results generated by two versions of that module (i.e., the literal understanding module of the SARF system with and without AOCSM). The evaluation results are encouraging: the first evaluation showed an improvement in F-measure of 5.12% relative to CSM, and the second showed that the acceptable understanding of the SARF system increased by 9.31% after the integration of AOCSM.
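
For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `f_measure` and the precision and recall values are hypothetical and are not taken from the paper. The abstract does not say whether the 5.12% gain is absolute (percentage points) or relative, so the sketch computes both.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical scores for a baseline and a new segmenter (illustrative only).
baseline = f_measure(precision=0.80, recall=0.70)
proposed = f_measure(precision=0.85, recall=0.76)

# Two common ways to report a gain; the abstract does not say which it uses.
absolute_gain_points = (proposed - baseline) * 100
relative_gain_percent = (proposed - baseline) / baseline * 100

print(f"baseline F1 = {baseline:.4f}, proposed F1 = {proposed:.4f}")
print(f"absolute gain = {absolute_gain_points:.2f} points")
print(f"relative gain = {relative_gain_percent:.2f} %")
```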

Keywords

Disfluencies; Hybrid approach