Implementation of Blind Speech Separation for Intelligent Humanoid Robot using DUET Method

Article ID	Journal	Published Year	Pages	File Type
4960430	Procedia Computer Science	2017	12 Pages	PDF

Abstract

Nowadays, many efforts aim to build intelligent humanoid robots with advanced abilities such as Blind Speech Separation (BSS). BSS is the problem of separating several speech signals in a real-world environment from a mono or stereo audio recording. In this research, we implement a BSS system using the DUET (Degenerate Unmixing Estimation Technique) algorithm, which can separate any number of sources using only two (stereo) mixtures. DUET replaces our previous FastICA (Fast Independent Component Analysis) method, which succeeded only in simulation and failed in the implementation. The main problem with FastICA is that it assumes instantaneous mixing, with no time delay in the recording process. To deal with audio recordings in the presence of inevitable time delays, FastICA has to be replaced with the DUET algorithm so that the sources can be separated well in real time. Finally, the DUET algorithm is implemented on a humanoid robot that is built on a Raspberry Pi and equipped with a RaspPi Cam to detect human faces. Furthermore, a Cirrus Logic Audio Card is stacked on the Raspberry Pi to record stereo audio. In our experiments, three controlled variables are used to evaluate the algorithm's performance: distance, number of sources, and subject's name. The robot records stereo audio for four seconds after the system detects a face. The recording is then separated by the DUET algorithm, producing two source estimates with an average computation time of 1.8 seconds. With the Google API, the recognition accuracy of the separated speech varies between 40% and 70%.
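
The role of time delay in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows, for a single frequency bin, the two quantities DUET relies on, namely the relative attenuation and relative delay between the two channels. The function names, the test tone, and the values 0.7 and 3 samples are all assumptions chosen for illustration. A full DUET system would compute these features for every bin of a windowed short-time Fourier transform, cluster them in a histogram, and build one time-frequency mask per cluster.

```python
import cmath
import math


def dft_bin(signal, k):
    """Return the k-th DFT coefficient of a real-valued sequence."""
    n_samples = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, x in enumerate(signal))


def duet_features(left, right, k):
    """Estimate relative attenuation and delay (in samples) at DFT bin k.

    Hypothetical helper: the ratio right/left equals a * exp(-j*omega*d)
    when the right channel is the left channel scaled by a and delayed by d.
    """
    omega = 2 * math.pi * k / len(left)        # angular frequency of bin k
    ratio = dft_bin(right, k) / dft_bin(left, k)
    attenuation = abs(ratio)                   # magnitude gives a
    delay = -cmath.phase(ratio) / omega        # phase gives d (if omega*d < pi)
    return attenuation, delay


# Assumed toy setup: one tone that fits exactly 4 cycles into 64 samples.
N, K = 64, 4
TRUE_ATTENUATION, TRUE_DELAY = 0.7, 3
omega = 2 * math.pi * K / N

left = [math.cos(omega * n) for n in range(N)]
right = [TRUE_ATTENUATION * math.cos(omega * (n - TRUE_DELAY))
         for n in range(N)]

attenuation, delay = duet_features(left, right, K)
print(round(attenuation, 6), round(delay, 6))
```

Under these assumptions the estimates should come out close to 0.7 and 3 samples. An instantaneous mixing model, as assumed by FastICA, has no term for the delay, which is consistent with the failure on real recordings reported above.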

Keywords

Speech recognition; DUET