Article ID: 566076
Journal: Speech Communication
Published Year: 2011
Pages: 19
File Type: PDF
Abstract

This paper focuses on foreign accent characterisation and identification in French. How many accents can a native French speaker recognise, and which cues does (s)he use? We concentrate on French productions by speakers of six different mother tongues: Arabic, English, German, Italian, Portuguese and Spanish, compared with native French speakers (from the Île-de-France region). Using automatic speech processing, our objective is to identify the most reliable acoustic cues distinguishing these accents, and to link these cues with human perception. We measured acoustic parameters such as duration and voicing for consonants, the first two formant values for vowels, word-final schwa-related prosodic features, and the percentages of confusions obtained using automatic alignment that includes non-standard pronunciation variants. Machine learning techniques were used to select the most discriminant cues distinguishing the accents and to classify speakers according to their accents. The results obtained in automatic identification of the linguistic origins under investigation compare favourably to perceptual data. The major accent-specific cues identified include the devoicing of voiced stop consonants, /b/ ∼ /v/ and /s/ ∼ /z/ confusions, the “rolled r”, and schwa fronting or raising. These cues can contribute to improving pronunciation modelling in automatic speech recognition of accented speech.

Research highlights
► Six foreign accents were investigated from perception, production and automatic processing viewpoints.
► Perceptual tests on read and spontaneous speech achieved at best 60% correct identification (including native French speakers).
► Listeners reported approximately 20 salient acoustic and prosodic cues (e.g. various pronunciations of “r”).
► Cues were defined and acoustic analyses performed using forced speech alignment on hours of speech.
► Using data mining techniques, automatic classification with 87 cues yielded at best a 74% correct identification rate.
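
The cue selection and speaker classification steps mentioned above can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which machine learning techniques were used, so the sketch assumes a simple Fisher-style ranking (between-accent variance over within-accent variance) followed by a nearest-centroid classifier. The cue names, the speaker values and the function names are all hypothetical and invented for illustration.

```python
from statistics import mean, pvariance

# Hypothetical toy data: (accent label, cue vector) per speaker.
# Cue names and values are invented for illustration, not taken from the paper.
CUES = ["stop_devoicing_rate", "b_v_confusion_rate", "schwa_f2_norm"]
SPEAKERS = [
    ("German",  [0.42, 0.03, 0.50]),
    ("German",  [0.38, 0.05, 0.55]),
    ("Spanish", [0.10, 0.30, 0.52]),
    ("Spanish", [0.12, 0.26, 0.47]),
    ("French",  [0.04, 0.02, 0.51]),
    ("French",  [0.06, 0.01, 0.49]),
]

def group_by_accent(speakers):
    """Collect the cue vectors of all speakers sharing an accent label."""
    groups = {}
    for accent, vec in speakers:
        groups.setdefault(accent, []).append(vec)
    return groups

def fisher_score(groups, j):
    """Variance of the accent means divided by the mean within-accent variance, for cue j."""
    class_means = [mean(v[j] for v in vecs) for vecs in groups.values()]
    within = mean(pvariance([v[j] for v in vecs]) for vecs in groups.values())
    return pvariance(class_means) / (within + 1e-9)  # small constant avoids division by zero

def rank_cues(speakers, names):
    """Return cue names ordered from most to least discriminant."""
    groups = group_by_accent(speakers)
    scores = {name: fisher_score(groups, j) for j, name in enumerate(names)}
    return sorted(names, key=scores.get, reverse=True)

def nearest_centroid(train, vec, keep):
    """Assign vec to the accent whose mean cue vector is closest, using only the cue indices in keep."""
    best, best_dist = None, float("inf")
    for accent, vecs in group_by_accent(train).items():
        centroid = [mean(v[j] for v in vecs) for j in keep]
        dist = sum((vec[j] - c) ** 2 for j, c in zip(keep, centroid))
        if dist < best_dist:
            best, best_dist = accent, dist
    return best

ranking = rank_cues(SPEAKERS, CUES)
top_two = [CUES.index(name) for name in ranking[:2]]
print(ranking)
print(nearest_centroid(SPEAKERS, [0.11, 0.29, 0.50], top_two))
```

On this toy data the two confusion-type cues separate the groups and the schwa cue does not, which is only a property of the invented numbers. A real study would need held-out speakers for evaluation and cue normalisation across speakers.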
