Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
411306 | Robotics and Autonomous Systems | 2014 | 17 Pages |
Enhancing perception of the local environment with semantic information, such as the room type, is an important ability for agents acting in that environment. Such high-level knowledge can reduce the effort needed for tasks such as object detection. This paper shows how to extract the room label from a small number of room percepts taken from a single viewpoint (such as the door frame when entering the room). This functionality is similar to the human ability to get an impression of a scene from a quick glance. We propose a new three-dimensional (3D) spatial feature vector that captures the layout of a scene from extracted planar surfaces. The trained models emulate the human brain's sensitivity to the 3D geometry of a room. Further, we show that our descriptor complements the information encoded by the Gist feature vector, a first attempt to model the brain areas involved in this sensitivity. Gist extracts global scene properties from edge information in 2D depictions of the scene. Both features can be fused, resulting in a system that follows our goal of combining psychological insights on human scene perception with physical properties of environments. This paper provides detailed insights into the nature of our spatial descriptor.
► We present a new 3D feature vector capturing the spatial layout of indoor scenes.
► The descriptor is based on characteristics of planar surfaces.
► It is inspired by human brain areas that are sensitive to the 3D geometry of scenes.
► It allows robots to categorize their local environment quickly.
► We provide a detailed analysis of the descriptor’s scene recognition performance.