Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
566685 | 1452021 | 2016 | 13-page PDF | Free download |
• Relating MFCCs, wavelets, scattering transforms, and tract variables to phonology.
• Acoustic-articulatory inversion (AAI) with task dynamics.
• Comparison of sum-product and deep-belief networks on AAI.
• Comparison of AAI on speakers with and without cerebral palsy.
We provide the first direct comparison of sum-product networks (SPNs) and deep-belief networks (DBNs) on speech, and the first application of SPNs to acoustic-articulatory inversion. Interestingly, speech from individuals with cerebral palsy is reconstructed significantly more accurately across all manners of articulation with SPNs than with DBNs. To select appropriate input parameters, we first compare MFCCs, wavelets, scattering coefficients, and vocal ‘tract variables’ as predictors of phonological features. Here, MFCCs yield more accurate classification across a broad array of phonological categories (accuracy in the high 90s in many cases) than the other feature types. All experiments use the MOCHA-TIMIT and TORGO acoustic-articulatory databases.
Journal: Speech Communication - Volume 79, May 2016, Pages 61–73