Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
561402 | 875302 | 2012 | 16-page PDF | Free download |

In this article, we present LipSynch, a software tool for the automatic replacement of speech dialogues in motion pictures, video, or television series. The system operates in two steps. During analysis, a split Dynamic Time Warping algorithm measures the timing relationships between the speech segments of the dialogues that serve as a timing reference and the corresponding speech segments in the replacement dialogues. During synthesis, the obtained warping paths are processed and used to generate high-quality, natural-sounding speech dialogues that are precisely time-synchronized with the reference dialogues. Subjective audio-visual listening tests, performed within the context of a difficult Automatic Dialogue Replacement task, demonstrated that LipSynch achieves a significant improvement over the industry-standard benchmark VocALign, in both lip-synchronization accuracy and overall speech quality of the synthesized dialogues.
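To make the analysis step more concrete, the following is a minimal sketch of textbook Dynamic Time Warping between two one-dimensional feature sequences. It is not the split variant used by LipSynch, whose details are not given in this abstract. The function name `dtw_path`, the absolute-difference local cost, and the use of plain scalar features are all assumptions made for illustration; a real system would more likely compare frame-wise spectral feature vectors.

```python
from typing import List, Sequence, Tuple


def dtw_path(
    ref: Sequence[float], rep: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align `rep` (replacement) to `ref` (reference) with classic DTW.

    Hypothetical helper for illustration only. Returns the total alignment
    cost and the warping path as (ref_index, rep_index) pairs.
    """
    n, m = len(ref), len(rep)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = minimal cost of aligning ref[:i] with rep[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(ref[i - 1] - rep[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # both sequences advance
                cost[i - 1][j],      # only the reference advances
                cost[i][j - 1],      # only the replacement advances
            )

    # Backtrack from the end to the start to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path
```

Each pair in the returned path says which frame of the replacement should coincide with which frame of the reference. A synthesis stage could, in principle, derive local time-scaling factors from the slope of that path.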
► We propose a new approach for automatic synchronization of two speech waveforms.
► We distinguish speech from non-speech to improve overall timing accuracy.
► Gradual time-scaling and speech rate control lead to more natural-sounding results.
► A software tool named LipSynch is proposed to perform time synchronization.
Journal: Signal Processing - Volume 92, Issue 2, February 2012, Pages 439–454