Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
558223 | 1451691 | 2016 | 12-page PDF | Free download |
Keywords
1. Introduction
Figure 1. Illustration of the proposed preprocessing method for elderly speech recognition.
2. Related work
3. Acoustic characteristics of the elderly and the conventional extraction method
Figure 2. Speech recognition accuracy rate (%).
4. Experiments and results
4.1 Comparative analysis of speech cues among elderly and young adults
Figure 3. Mean speech rate (in seconds).
Figure 4. Mean silence length (in seconds).
Table 1. Frequency distribution of the first vowel of each word.
Table 2. Proportion of significant p-values in the formant frequencies, by gender.
Table 3. Significance of the main vowels for each formant, by gender (p-value).
4.2 Results of preprocessing and elderly speech recognition
Figure 5. Change in the F1-F2 vowel space as a function of age and gender.
Figure 6. Relative energy ratio of the vowel frequency bands (%).
Table 4. Significance level of the relative formant frequency band energy ratio.
Figure 7. Mean speech recognition accuracy rate after increasing the speech rate.
Figure 8. Speech recognition accuracy rate (%) of elderly women after silence removal.
Figure 9. Speech recognition accuracy rate (%) of elderly women after adjusting RF2E and RF3E.
Figure 10. Speech recognition accuracy rate (%) with additional speech data.
5. Conclusion
• Preprocessed elderly voice signals were tested with an Android smartphone.
• Speech recognition accuracy increased by 1.5% when the speech rate was increased.
• Speech recognition accuracy increased by 4.2% when inter-syllabic pauses were eliminated.
• Speech recognition accuracy increased by 6% when the formant frequency bands were boosted.
• After all the preprocessing, a 12% increase in recognition accuracy was achieved.
With the population aging and smart devices proliferating, speech recognition on smart devices needs to improve so that information is as easily accessible to the elderly as it is to the younger population. In general, speech recognition systems are optimized for an average adult's voice and tend to exhibit lower accuracy when recognizing an elderly person's voice, owing to the effects of speech articulation and speaking style. Modifying current speech recognition systems to better recognize elderly users is bound to incur additional costs. Using a preprocessing application on the smart device can therefore not only deliver better speech recognition but also substantially reduce those added costs. Audio samples of 50 words uttered by 80 elderly and young adults were collected and comparatively analyzed. The speech of the elderly has a slower rate, longer inter-syllabic silences, and slightly lower intelligibility. The speech recognition accuracy for elderly adults could be improved by increasing the speech rate (a 1.5% increase in accuracy), eliminating silence periods (a further 4.2% increase), and boosting the energy of the formant frequency bands (a 6% increase). After all the preprocessing, a 12% increase in the accuracy of elderly speech recognition was achieved. Through this study, we show that speech recognition of elderly voices can be improved by modifying specific aspects of the differences in speech articulation and speaking style. In the future, we will conduct studies on methods that can precisely measure and adjust speech rate, and look for additional factors that affect intelligibility.
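Two of the preprocessing steps named in the abstract, silence removal and boosting a formant frequency band, can be sketched roughly as follows. The abstract does not say which algorithms the authors used, so this is only an illustration under assumptions: the function names `remove_silence` and `peaking_boost`, the 20 ms frame length, the -40 dB threshold, and the center frequency and Q value are all hypothetical choices. The boost uses a standard biquad peaking-EQ filter as a stand-in. Speech-rate modification is left out because it needs a time-scale modification algorithm that does not fit a short sketch.

```python
import math


def rms(frame):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def remove_silence(samples, sample_rate, frame_ms=20, threshold_db=-40.0):
    """Drop frames whose RMS lies more than |threshold_db| below the loudest frame.

    Assumed, illustrative parameters; trailing samples that do not fill a
    whole frame are discarded.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return list(samples)
    levels = [rms(f) for f in frames]
    peak = max(levels)
    if peak == 0.0:
        return []  # the whole signal is silent
    floor = peak * 10 ** (threshold_db / 20)
    kept = []
    for frame, level in zip(frames, levels):
        if level > floor:
            kept.extend(frame)
    return kept


def peaking_boost(samples, sample_rate, center_hz, gain_db, q=1.0):
    """Boost the band around center_hz with a biquad peaking-EQ filter.

    A stand-in for "boosting the energy of a formant frequency band";
    center_hz and q are assumptions, not values from the paper.
    """
    a = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b0, b1, b2 = 1 + alpha * a, -2 * math.cos(w0), 1 - alpha * a
    a0, a1, a2 = 1 + alpha / a, -2 * math.cos(w0), 1 - alpha / a
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        # Direct-form I difference equation, normalized by a0.
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


if __name__ == "__main__":
    rate = 16000
    # Synthetic test signal: tone, half a second of silence, tone.
    tone = [0.5 * math.sin(2 * math.pi * 1500 * n / rate) for n in range(8000)]
    signal = tone + [0.0] * 8000 + tone
    trimmed = remove_silence(signal, rate)
    boosted = peaking_boost(trimmed, rate, center_hz=1500, gain_db=6.0)
    print(len(signal), len(trimmed), len(boosted))
```

On real recordings, the threshold would need tuning against background noise, and the boost would be centered on measured formant bands rather than a fixed frequency.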
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 110–121