Article ID Journal Published Year Pages File Type
559062 Computer Speech & Language 2012 15 Pages PDF
Abstract

We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party conversations recorded in real-world environments with background noise. It can be used to train noise-robust speech recognition systems or to develop speech de-noising algorithms. We explain the motivation for creating such a corpus and describe the audio recordings and transcriptions that comprise it. These high-quality recordings were captured in situ on a custom wearable recording system, whose design and construction are also described. Seven synchronized audio channels are recorded separately: four from a far-field microphone array, and one each from a close-talking microphone, a monophonic far-field microphone, and a throat microphone. The corpus thus creates many possibilities for speech algorithm research.

► We present the COnversational Speech In Noisy Environments (COSINE) corpus.
► COSINE consists of multi-party conversations recorded in noisy environments.
► The recordings were captured in situ on a custom wearable portable recording system.
► Seven separate heterogeneous synchronized audio channels have been captured.
► The corpus is useful for noise-robust ASR and speech de-noising algorithms.

Related Topics
Physical Sciences and Engineering Computer Science Signal Processing