کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6960582 | 1452001 | 2018 | 17 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential
ترجمه فارسی عنوان
تبدیل صدا صدای آماری درون جنسیت با تغییر شکل موج مستقیم با استفاده از دیفرانسیل طیفی ورودی
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
پردازش سیگنال
چکیده انگلیسی
This paper presents a novel intra-gender statistical singing voice conversion (SVC) technique with direct waveform modification based on the log-spectrum differential (DIFFSVC) that can convert the voice timbre of a source singer into that of a target singer without vocoder-based waveform generation of the converted singing voice. SVC makes it possible to convert the singing voice characteristics of an arbitrary source singer into those of an arbitrary target singer by converting some of its acoustic features, such as F0, aperiodicity, and spectral features based on a statistical conversion function. However, the sound quality of the converted singing voice is typically degraded compared with that of a natural singing voice, owing to various factors, such as analysis and modeling errors in the vocoding process and over-smoothing of the converted feature trajectory. To alleviate sound quality degradation, we propose a statistical conversion process that directly modifies the signal in the waveform domain by estimating the difference in the spectra of the source and target singers' singing voices. Additionally, we propose the following several techniques for the DIFFSVC method: 1) derivation of a differential Gaussian mixture model (DIFFGMM) from a conventional Gaussian mixture model (GMM) and 2) a parameter generation algorithm considering the global variance (GV). The experimental results demonstrate that the proposed DIFFSVC methods enable significant improvements in the sound quality of the converted singing voice, while preserving the conversion accuracy of the singer's identity compared with conventional SVC.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Speech Communication - Volume 99, May 2018, Pages 211-220
Journal: Speech Communication - Volume 99, May 2018, Pages 211-220
نویسندگان
Kazuhiro Kobayashi, Tomoki Toda, Satoshi Nakamura,