Video-based descriptors for object recognition

Article ID	Journal	Published Year	Pages	File Type
527048	Image and Vision Computing	2011	14 Pages	PDF

Abstract

We describe a visual recognition system operating on a hand-held device, based on a video-based feature descriptor, and characterize its invariance and discriminative properties. Feature selection and tracking are performed in real time and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system recognizes objects in the field of view based on their ranking. Severe resource constraints have prompted a re-evaluation of existing algorithms, improving their performance (accuracy and robustness) as well as their computational efficiency. We motivate the design choices in the implementation with a characterization of the stability properties of local invariant detectors, and of the conditions under which a template-based descriptor is optimal. The analysis also highlights the role of time as a “weak supervisor” during training, which we exploit in our implementation.
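As a rough illustration of the idea summarized above, the following sketch builds one template per object by averaging the descriptors collected along a feature track, then ranks the known objects by distance to a query descriptor. It is not the system described in the article: the function names (`build_template`, `rank_objects`), the use of a plain mean as the template, and the Euclidean distance are all assumptions made for illustration. The only "supervision" in the sketch is temporal: all descriptors from the same track are assumed to belong to the same object.

```python
import numpy as np


def build_template(track_descriptors):
    """Average the per-frame descriptors of one track into a unit-norm template.

    Assumption: every descriptor in the track comes from the same object,
    which is the "weak supervision" that temporal continuity provides.
    """
    d = np.asarray(track_descriptors, dtype=float)
    template = d.mean(axis=0)
    norm = np.linalg.norm(template)
    return template / norm if norm > 0 else template


def rank_objects(query, templates):
    """Return object labels sorted from closest to farthest template."""
    q = np.asarray(query, dtype=float)
    norm = np.linalg.norm(q)
    if norm > 0:
        q = q / norm
    distances = {label: float(np.linalg.norm(q - t)) for label, t in templates.items()}
    return sorted(distances, key=distances.get)


# Hypothetical toy data: two objects, each observed over a short track.
templates = {
    "mug": build_template([[1.0, 0.0], [0.9, 0.1]]),
    "book": build_template([[0.0, 1.0], [0.1, 0.9]]),
}
ranking = rank_objects([1.0, 0.05], templates)
```

Here `ranking` lists the stored objects in order of similarity to the query, which mirrors the ranking-based recognition step in spirit only; a real system would use many tracks per object and a descriptor with far more dimensions.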

Highlights

► We analyze and derive representations of objects from video.
► We integrate multi-scale detection and tracking.
► We derive a video-based feature descriptor.
► We describe a visual recognition system for a hand-held device.

Keywords

Object recognition; Feature tracking; Visual recognition; Mobile devices; Active vision