Article ID Journal Published Year Pages File Type
6938058 Journal of Visual Communication and Image Representation 2018 11 Pages PDF
Abstract
We study the problem of 3D object detection from RGB-D images so as to achieve localization (i.e., producing a bounding box around the object) and classification (i.e., determining the object category) simultaneously. Its challenges arise from high intra-class variability, illumination change, background clutter and occlusion. To solve this problem, we propose a novel solution that integrates the 2D information (RGB images), the 3D information (RGB-D images) and the object/scene context information together, and call it the Context-Assisted 3D (C3D) method. In the proposed C3D method, we first use a convolutional neural network (CNN) to jointly detect a 3D object in a scene and its scene category. Then, we improve the detection result furthermore with a Conditional Random Field (CRF) model that incorporates the object potential, the scene potential, the scene/object context, the object/object context, and the room geometry. Extensive experiments are conducted to demonstrate that the proposed C3D method achieves the state-of-the-art performance for 3D object detection against the SUN RGB-D benchmark dataset.
Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, , , ,