Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
559062 | Computer Speech & Language | 2012 | 15 Pages |
We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party conversations recorded in real-world environments with background noise. It can be used to train noise-robust speech recognition systems or to develop speech de-noising algorithms. We explain the motivation for creating such a corpus and describe the audio recordings and transcriptions that comprise it. These high-quality recordings were captured in situ on a custom wearable recording system, whose design and construction are also described. Seven synchronized audio channels are recorded separately: four from a far-field microphone array, plus one each from a close-talking microphone, a monophonic far-field microphone, and a throat microphone. The corpus thus creates many possibilities for speech algorithm research.
► We present the COnversational Speech In Noisy Environments (COSINE) corpus.
► COSINE consists of multi-party conversations recorded in noisy environments.
► The recordings were captured in situ on a custom wearable portable recording system.
► Seven separate heterogeneous synchronized audio channels have been captured.
► The corpus is useful for noise-robust ASR and speech de-noising algorithms.