Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
566664 | 1452019 | 2016 | 12-page PDF | Free download |
• Synthetic speech detectors based on phase information (RPS, MGD) are analyzed.
• Training using real attack samples and copy-synthesized material is evaluated.
• The detectors are evaluated against unknown attacks, including channel effects.
• Detectors work well for voice conversion and adapted synthetic speech impostors.
Taking advantage of the fact that most speech processing techniques neglect phase information, we seek to detect phase perturbations in order to prevent synthetic impostors from attacking Speaker Verification systems. Two Synthetic Speech Detection (SSD) systems that use spectral phase related information are reviewed and evaluated in this work: one based on the Modified Group Delay (MGD), and the other based on the Relative Phase Shift (RPS). A classical magnitude-based MFCC system is also used as a baseline. Different training strategies are proposed and evaluated using both real spoofing samples and signals copy-synthesized from natural speech, aiming to alleviate the difficulty of obtaining real attack data to train the systems. The recently published ASVSpoof2015 database is used for training and evaluation. Performance with completely unrelated data is also checked using synthetic speech from the Blizzard Challenge as evaluation material. The results show that phase information can be successfully used for the SSD task even with unknown attacks.
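As an illustration of the kind of phase-related feature the abstract refers to, the following is a minimal sketch of a modified group delay computation for a single speech frame. It follows the commonly cited formulation (group delay computed from the spectra of x[n] and n·x[n], with a cepstrally smoothed magnitude in the denominator and two compression exponents). The abstract does not give the authors' configuration, so the function name `modified_group_delay` and the values of `alpha`, `gamma`, `lifter`, and `n_fft` are assumptions chosen for illustration only.

```python
import numpy as np

def modified_group_delay(frame, alpha=0.4, gamma=0.9, lifter=8, n_fft=512):
    """Modified group delay of one frame.

    alpha, gamma, lifter and n_fft are illustrative assumptions,
    not values taken from the paper.
    """
    x = np.asarray(frame, dtype=float)
    n = np.arange(len(x))

    X = np.fft.rfft(x, n_fft)      # spectrum of x[n]
    Y = np.fft.rfft(n * x, n_fft)  # spectrum of n * x[n]

    # Cepstrally smoothed magnitude spectrum: keep only low quefrencies.
    log_mag = np.log(np.abs(X) + 1e-10)
    cep = np.fft.irfft(log_mag, n_fft)
    window = np.zeros(n_fft)
    window[:lifter] = 1.0
    if lifter > 1:
        window[-(lifter - 1):] = 1.0  # mirror half keeps the lifter symmetric
    S = np.exp(np.fft.rfft(cep * window, n_fft).real)

    # Group delay with the smoothed magnitude in the denominator.
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2.0 * gamma))

    # Sign-preserving compression of the dynamic range.
    return np.sign(tau) * np.abs(tau) ** alpha
```

A quick sanity check under these assumptions: with `alpha=1.0` and `gamma=1.0`, a unit impulse delayed by 5 samples should yield a group delay of 5 samples at every frequency bin. In a detector, such per-frame values would typically be reduced further (for example to cepstral-like coefficients) before being passed to a classifier; that stage is not sketched here.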
Journal: Speech Communication - Volume 81, July 2016, Pages 30–41