Article ID: 4960430
Journal: Procedia Computer Science
Published Year: 2017
Pages: 12 Pages
File Type: PDF
Abstract

Many current efforts aim to build intelligent humanoid robots with advanced abilities such as Blind Speech Separation (BSS). BSS is the problem of separating several speech signals recorded in a real-world environment from a mono or stereo audio recording. In this research, we implement a BSS system using the DUET (Degenerate Unmixing Estimation Technique) algorithm, which can separate any number of sources using only two (stereo) mixtures. DUET replaces our previous FastICA (Fast Independent Component Analysis) method, which succeeded only in simulation and failed in the real implementation. The main problem with FastICA is that it assumes instantaneous mixing, with no time delay in the recording process. To deal with audio recordings in the presence of inevitable time delays, FastICA has to be replaced with the DUET algorithm, which separates the sources well in real time. Finally, the DUET algorithm is implemented on a humanoid robot that is built on a Raspberry Pi and equipped with a RaspPi Cam to detect human faces. In addition, a Cirrus Logic Audio Card is stacked on the Raspberry Pi to record stereo audio. In our experiments, three controlled variables are used to evaluate the algorithm's performance: distance, number of sources, and subject's name. The robot records stereo audio for four seconds after the system detects a face. The recording is then separated by the DUET algorithm, producing two source estimates with an average computation time of 1.8 seconds. With the Google API, the recognition accuracy of the separated speech varies between 40% and 70%.
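
The abstract does not give implementation details, so the following is only a minimal sketch of the idea DUET is generally described as building on, not the authors' code. It assumes a delayed mixing model in which the second microphone receives a scaled, delayed copy of what the first one hears, and it estimates a relative attenuation and delay for every time-frequency bin from the ratio of the two short-time Fourier transforms. The function names (`stft`, `duet_features`, `delayed_copy`), the window settings, and the synthetic single-source test signal are all hypothetical choices made for illustration. A full separation would, as far as the usual description of DUET goes, additionally build a weighted histogram of these per-bin estimates, pick one peak per source, and mask the bins assigned to each peak; that step is not shown here.

```python
import numpy as np


def stft(x, win_len=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (n_frames, win_len // 2 + 1)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)


def duet_features(x1, x2, win_len=256, hop=128):
    """Per-bin relative attenuation and delay (in samples) of x2 with respect to x1."""
    # Drop the DC and Nyquist bins: they are real-valued for real input,
    # so they carry no usable phase (delay) information.
    X1 = stft(x1, win_len, hop)[:, 1:-1]
    X2 = stft(x2, win_len, hop)[:, 1:-1]
    omega = 2 * np.pi * np.arange(1, win_len // 2) / win_len  # rad/sample
    ratio = X2 / (X1 + 1e-12)          # small constant guards against division by zero
    atten = np.abs(ratio)              # relative attenuation per bin
    delay = -np.angle(ratio) / omega   # relative delay per bin
    weight = np.abs(X1 * X2)           # emphasise bins that carry energy
    return atten, delay, weight


def delayed_copy(s, atten, delay):
    """Hypothetical test helper: scaled, circularly delayed copy of s.

    A fractional delay is applied as a linear phase ramp in the frequency domain.
    """
    omega = 2 * np.pi * np.fft.rfftfreq(len(s))
    spectrum = np.fft.rfft(s) * atten * np.exp(-1j * omega * delay)
    return np.fft.irfft(spectrum, n=len(s))


# Synthetic check with ONE source (assumed values: attenuation 0.7, delay 0.5 samples).
rng = np.random.default_rng(0)
source = rng.standard_normal(8000)
mic1 = source
mic2 = delayed_copy(source, atten=0.7, delay=0.5)

atten, delay, weight = duet_features(mic1, mic2)
est_atten = np.sum(weight * atten) / np.sum(weight)
est_delay = np.sum(weight * delay) / np.sum(weight)
print(f"estimated attenuation: {est_atten:.2f}, estimated delay: {est_delay:.2f} samples")
```

The delay in this check is kept below one sample so that the phase never wraps past pi. An instantaneous-mixing model such as the one FastICA assumes corresponds to a delay of zero in every bin, which is the assumption the abstract says broke down on real recordings.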

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)