Indoor scene understanding via monocular RGB-D images

Article ID	Journal	Published Year	Pages	File Type
392022	Information Sciences	2015	11 Pages	PDF

Abstract

Scene understanding, a high-level cognitive function of human beings, has long been difficult to replicate in computer vision and machine learning. Much research has addressed the problem, but it remains a great challenge because scene understanding involves both top-down and bottom-up processing in the human cognitive system. The deficiencies of 2D data have been demonstrated, especially for indoor scenes, which are highly complex, and researchers are now turning to 3D data. In this paper, we present an approach to indoor scene understanding based on monocular RGB-D images. We explore significant features and structural information from depth images, assess their applicability to indoor scenes, and integrate them with conventional 2D information throughout the process of indoor scene understanding. Additionally, we formulate indoor scene understanding as a global optimization framework comprising segmentation, support inference, multi-object recognition, and scene classification. Experiments demonstrate that our approach significantly outperforms state-of-the-art algorithms in all these sub-tasks.

Keywords

RGB-D; Global optimization