Training speech translation from audio recordings of interpreter-mediated communication

Article ID	Journal	Published Year	Pages	File Type
558415	Computer Speech & Language	2013	20 Pages	PDF

Abstract

Globalization, as well as international crises and disasters, spurs the need for cross-lingual verbal communication in a myriad of languages. This need is reflected in intense ongoing research activity in the field of speech translation. However, deployable speech translation systems are still developed for only a handful of languages. One of the main reasons is the prohibitively high cost of acquiring sufficient amounts of suitable speech translation training data. A new language pair or domain is typically considered for speech translation development only after a major need for cross-lingual verbal communication has arisen, which justifies the high development costs. In such situations, communication has to rely on the help of interpreters, while massive data collections for system development are conducted in parallel. We propose an alternative to this time-consuming and costly parallel effort. By training speech translation directly on audio recordings of interpreter-mediated communication, we omit most of the manual transcription effort and all of the manual translation effort that characterize traditional speech translation development.

► Rapid and cost-effective development of speech translation systems.
► Bootstrapping speech translation directly from interpreter-mediated conversations.
► Parallel speech audio: a novel resource for training automatic translation systems.

Keywords

Spoken language translation