Article ID: 754316
Journal: Applied Acoustics
Published Year: 2015
Pages: 12
File Type: PDF
Abstract

A method is presented for personalizing head-related transfer functions (HRTFs) from anthropometric features extracted from digital portraits of a subject by computer vision, using the active shape models (ASM) algorithm. Advancing previous work, the method is automatic and represents the subject with a simpler, more efficient 2D image model. The ASM is trained beforehand on a data set of manually annotated images of other subjects and is then used to automatically locate anthropometric features of the ears, head, and torso in photographs of the given subject. Anthropometric parameters in metric linear units are obtained from the image model as a function of camera settings, field of view, and distance to the subject. Finally, the anthropometric data are used to personalize HRTFs for the subject by inference from the CIPIC HRTF database, which contains anthropometric and HRTF data for a number of subjects, using best anthropometric match selection. Results were evaluated using two definitions of acoustic spectral distortion as comparative measures of performance: a previously proposed one based on magnitude only, and a novel one based on magnitude and phase. The analysis concludes that HRTFs can be personalized through automatic photo-anthropometry with computer vision algorithms and inference of HRTFs from a database. However, evidence is also presented suggesting that these performance metrics need further improvement and that current anthropometric and HRTF databases may still be insufficient to provide more satisfactory results, pointing to the need for more research in this area.
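
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so everything here is an assumption. The pixel-to-metric step assumes an ideal pinhole camera with the measured feature lying in a plane perpendicular to the optical axis; the match step assumes a Euclidean distance scaled by each feature's spread in the database; and the distortion measure is a commonly used magnitude-only log-spectral definition, which may differ from the one in the paper. The novel magnitude-and-phase measure is not sketched, since the abstract does not define it. All function names and data are hypothetical.

```python
import math
import statistics


def pixels_to_metres(length_px, image_width_px, horizontal_fov_deg, distance_m):
    """Convert a length measured in pixels to metres (pinhole-camera assumption).

    The scene width covered by the image at the subject's distance is
    2 * d * tan(fov / 2); dividing by the image width gives metres per pixel.
    """
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(horizontal_fov_deg) / 2.0)
    return length_px * scene_width_m / image_width_px


def best_anthropometric_match(query, database):
    """Return the id of the database subject closest to `query`.

    `database` maps a subject id to a feature list of the same length as
    `query`. Differences are scaled by each feature's standard deviation
    across the database (an assumed normalization, not taken from the paper).
    """
    columns = list(zip(*database.values()))
    # Fall back to 1.0 when a feature does not vary, to avoid dividing by zero.
    scales = [statistics.pstdev(col) or 1.0 for col in columns]

    def distance(features):
        return math.sqrt(sum(((q - f) / s) ** 2
                             for q, f, s in zip(query, features, scales)))

    return min(database, key=lambda subject_id: distance(database[subject_id]))


def spectral_distortion_db(h_ref, h_est):
    """Magnitude-only spectral distortion in dB between two frequency responses.

    Root-mean-square over frequency bins of 20 * log10(|H_ref| / |H_est|).
    Inputs may be real or complex sequences of equal length, with no zero bins.
    """
    errors = [20.0 * math.log10(abs(a) / abs(b)) for a, b in zip(h_ref, h_est)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    ear_height_m = pixels_to_metres(100, 1000, 90.0, 1.0)
    toy_db = {"subject_a": [0.060, 0.150], "subject_b": [0.065, 0.160]}
    print(ear_height_m, best_anthropometric_match([0.064, 0.158], toy_db))
```

In this sketch, the HRTF set of the returned subject would stand in for the personalized HRTFs, and the distortion function would compare it against a measured reference when one is available.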

Related Topics
Physical Sciences and Engineering > Engineering > Mechanical Engineering