Context-Assisted 3D (C3D) Object Detection from RGB-D Images

Article ID	Journal	Published Year	Pages	File Type
6938058	Journal of Visual Communication and Image Representation	2018	11 Pages	PDF

Abstract

We study 3D object detection from RGB-D images, which requires simultaneous localization (i.e., producing a bounding box around the object) and classification (i.e., determining the object category). The challenges arise from high intra-class variability, illumination changes, background clutter, and occlusion. To address them, we propose the Context-Assisted 3D (C3D) method, which integrates 2D information (RGB images), 3D information (RGB-D images), and object/scene context information. The C3D method first uses a convolutional neural network (CNN) to jointly detect a 3D object in a scene and classify the scene category. It then further refines the detection result with a Conditional Random Field (CRF) model that incorporates the object potential, the scene potential, the scene/object context, the object/object context, and the room geometry. Extensive experiments on the SUN RGB-D benchmark dataset demonstrate that the proposed C3D method achieves state-of-the-art performance for 3D object detection.
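
The abstract does not give the form of the CRF, so the following is only a minimal illustrative sketch of how context terms can change a detection result, not the authors' model. It assumes a tiny label set, hand-picked costs (lower is better), additive energy terms, and brute-force inference; all names (crf_energy, map_inference, the cost tables) are hypothetical, and the room geometry term is omitted.

```python
from itertools import product

def crf_energy(obj_labels, scene, obj_unary, scene_unary, scene_obj, obj_obj):
    """Energy of one joint labeling (lower is better). Illustrative only."""
    energy = scene_unary[scene]                       # scene potential
    for i, label in enumerate(obj_labels):
        energy += obj_unary[i][label]                 # object potential
        energy += scene_obj.get((scene, label), 0.0)  # scene/object context
    for i in range(len(obj_labels)):
        for j in range(i + 1, len(obj_labels)):
            pair = frozenset((obj_labels[i], obj_labels[j]))
            energy += obj_obj.get(pair, 0.0)          # object/object context
    return energy

def map_inference(labels, scenes, obj_unary, scene_unary, scene_obj, obj_obj):
    """Exhaustively search all joint labelings; feasible only for toy sizes."""
    candidates = (
        (crf_energy(objs, scene, obj_unary, scene_unary, scene_obj, obj_obj),
         objs, scene)
        for scene in scenes
        for objs in product(labels, repeat=len(obj_unary))
    )
    return min(candidates, key=lambda c: c[0])

# Hypothetical costs for two candidate detections (e.g. negative log scores).
LABELS = ["bed", "toilet"]
SCENES = ["bedroom", "bathroom"]
OBJ_UNARY = [{"bed": 0.2, "toilet": 1.5}, {"bed": 0.9, "toilet": 0.8}]
SCENE_UNARY = {"bedroom": 0.4, "bathroom": 0.6}
SCENE_OBJ = {("bedroom", "toilet"): 2.0, ("bathroom", "bed"): 2.0}
OBJ_OBJ = {frozenset(("bed", "toilet")): 1.0}  # penalize unlikely co-occurrence

energy, objs, scene = map_inference(
    LABELS, SCENES, OBJ_UNARY, SCENE_UNARY, SCENE_OBJ, OBJ_OBJ
)
print(objs, scene)
```

With these made-up costs, the second detection on its own slightly prefers "toilet" (0.8 versus 0.9), but the joint minimum labels both detections "bed" in a "bedroom" scene, because the context terms penalize the mixed labeling. This is the kind of correction a context-aware refinement stage is meant to provide.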

Keywords

Conditional random field; Convolutional neural network