Building personalised synthetic voices for individuals with severe speech impairment

Article ID	Journal	Published Year	Pages	File Type
559031	Computer Speech & Language	2013	16 Pages	PDF

Abstract

For individuals with severe speech impairment, accurate spoken communication can be difficult and can require considerable effort. Some may choose to use a voice output communication aid (VOCA) to support their spoken communication needs. A VOCA typically takes input from the user through a keyboard or switch-based interface and produces spoken output using either synthesised or recorded speech. The type and number of synthetic voices that can be accessed with a VOCA are often limited, and this has been implicated as a factor in the rejection of the devices. There is therefore a need to provide voices that are more appropriate and acceptable for users.

This paper reports on a study that utilises recent advances in speech synthesis to produce personalised synthetic voices for 3 speakers with mild to severe dysarthria, one of the most common speech disorders. Using a statistical parametric approach to synthesis, an average voice trained on data from several unimpaired speakers was adapted using recordings of the impaired speech of the 3 dysarthric speakers. By careful selection of the speech data and the model parameters, several exemplar voices were produced for each speaker. A qualitative evaluation was conducted with the speakers and with listeners who were familiar with them. The evaluation showed that for one of the 3 speakers a voice could be created which conveyed many of his personal characteristics, such as regional identity, sex and age.

► We built personalised synthetic voices for 3 individuals with dysarthria.
► We evaluated output similarity to speaker, quality and appropriateness for speaker.
► Impaired speech can be reconstructed using statistical parametric speech synthesis.
► Output will improve with more appropriate starting points for adaptation of data.

Keywords

Augmentative and alternative communication; Speech synthesis