Real-time estimation of 3D scene geometry from a single image

Article ID	Journal	Published Year	Pages	File Type
533373	Pattern Recognition	2012	14 Pages	PDF

Abstract

Significant advances have recently been made in computational methods for predicting 3D scene structure from a single monocular image. However, their computational complexity severely limits the adoption of such methods in computer vision and pattern recognition applications. In this paper, we address the problem of inferring 3D scene geometry from a single monocular image of a man-made environment. Our goal is to estimate the 3D structure of a scene in real time, with a level of accuracy useful in certain real applications. To this end, we decompose the three-dimensional world space into a set of geometrically inspired primitive subspaces. An important advantage of this approach is that the complex estimation problem can be systematically broken down, without loss of generality, into a sequence of subproblems that are easier to solve and remain reliable even in the presence of occlusion or clutter. The proposed algorithm also serves as the technical foundation for an effective representation of 3D scene geometry based on a simple description of the textural patterns present in the image and their spatial arrangement. Extensive experiments were conducted on a large-scale, challenging dataset of real-world images. The results show that the proposed method markedly outperforms recent state-of-the-art algorithms in both speed and accuracy.

► We address the problem of inferring 3D scene geometry from a single monocular image.
► Our goal is to estimate the 3D structure of a scene in real time with substantial accuracy.
► The breakthrough originates from the discovery of a new modeling paradigm.
► The proposed method markedly outperforms the state of the art in both speed and accuracy.

Keywords

Depth information; Monocular vision; Machine learning