A 3D deformable model-based framework for the retrieval of near-isometric flattenable objects using Bag-of-Visual-Words

Article ID	Journal	Published Year	Pages	File Type
6937437	Computer Vision and Image Understanding	2018	42 Pages	PDF

Abstract

We introduce a 3D deformable model-based framework for the retrieval of near-isometric flattenable objects using keypoints and BoVW (Bag-of-Visual-Words). By 3D deformable model we mean a texture-mapped 3D shape which may deform isometrically. We assume that such a model is available for each object in the database. We exploit the 3D deformable models at both the training and the retrieval phases. As our first contribution, we exploit the possibility of generating synthetic data from the 3D deformable models to define a new BoVW model for the database object representation. Our model chooses an optimal per-object representation by maximizing each object's mean average precision. The maximization is done over multiple candidate representations, which are generated using the criteria of keypoint repeatability, weight discriminance and stability. Our second contribution is the use of SfT (Shape-from-Template) to facilitate geometric verification at the retrieval phase, for a few objects hypothesized using the new BoVW model. Existing methods use a rigid model, such as the fundamental matrix, or a simple deformable model based on semi-local constraints. SfT, however, is a physics-based method which uses an object's 3D deformable model to reconstruct its isometric 3D deformation from a single input image. The output of SfT thus directly provides a geometric verification score. A byproduct of our work is to extend the scope of SfT: the proposed object retrieval framework provides SfT with a few object hypotheses, which may be quickly tested for the 3D deformable object selection. Performance evaluation on synthetic and real images reveals the benefits of our retrieval framework using a database with size varying between 20 and 1000 objects. The use of the new BoVW model and SfT versus the BoVW baseline and a rigid model improves the retrieval performance by 4.2% and 11.3%, with p-values of 5×10⁻⁶ and 7×10⁻³⁰ respectively.
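The per-object selection step described in the abstract can be illustrated with a minimal sketch. It assumes that each candidate representation has already been evaluated on synthetic query images, yielding one ranked relevance list per query. The function names, the candidate labels and the toy rankings are hypothetical and are not taken from the paper; the sketch only shows the idea of keeping the candidate with the highest mean average precision.

```python
from statistics import mean


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance[i] is True when the result at rank i + 1 is correct.
    """
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return mean(precisions) if precisions else 0.0


def select_representation(candidate_rankings):
    """Return (name, mAP) of the candidate with the highest mean average precision.

    candidate_rankings maps a candidate name (hypothetical labels here) to a
    list of ranked relevance lists, one per synthetic query image.
    """
    scores = {
        name: mean(average_precision(r) for r in rankings)
        for name, rankings in candidate_rankings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data (invented for illustration): two candidates, two queries each.
toy_rankings = {
    "repeatability": [[True, False], [False, True]],
    "stability": [[True, True], [True, False]],
}
best_name, best_map = select_representation(toy_rankings)
```

In this toy case the "stability" candidate is kept, since both of its queries rank a correct result first. How the paper actually builds the candidates and the synthetic queries is not specified in the abstract.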

Keywords

Isometry; Bag-of-Visual-Words