Article ID: 4970055
Journal: Pattern Recognition Letters
Published Year: 2017
Pages: 10
File Type: PDF
Abstract
Most existing methods for human attribute recognition are part-based: features are extracted at the human body parts corresponding to each attribute, and these part-based features are then fed to classifiers, individually or together, to recognize the attributes. The performance of these methods depends heavily on the accuracy of body-part detection, which is a well-known challenging problem in computer vision. In contrast to these part-based methods, we propose to recognize human attributes using a CAM (Class Activation Map) network, and to further improve recognition by refining the attention heat map, an intermediate result in CAM that reflects the image regions relevant to each attribute. The proposed method requires neither the detection of body parts nor a prior correspondence between body parts and attributes. In particular, we define a new exponential loss function to measure the appropriateness of the attention heat map. The attribute classifiers are then trained with both the original classification loss function and this new exponential loss function. The proposed method is built on an end-to-end CNN with CAM, with a new component added for refining the attention heat map. We conduct experiments on the Berkeley Attributes of Human People Dataset and the WIDER Attribute Dataset. The proposed method achieves attribute recognition performance comparable to that of current state-of-the-art methods.
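To make the attention heat map concrete, the following is a minimal sketch of how a class activation map is commonly computed in a standard CAM network: each attribute's heat map is the weighted sum of the final convolutional feature maps, using that attribute's classifier weights after global average pooling. The function names, array shapes, and random inputs are assumptions made for illustration and do not come from the paper; the exponential refinement loss described in the abstract is not reproduced here.

```python
import numpy as np

def class_activation_maps(features, weights):
    """Compute one heat map per attribute.

    features: (K, H, W) feature maps from the last convolutional layer.
    weights:  (A, K) weights of the linear classifier that follows
              global average pooling, one row per attribute.
    Returns an (A, H, W) array of class activation maps.
    """
    return np.einsum("ak,khw->ahw", weights, features)

def attribute_logits(features, weights, bias):
    """Classifier scores: global average pooling, then a linear layer."""
    pooled = features.mean(axis=(1, 2))  # shape (K,)
    return weights @ pooled + bias       # shape (A,)

def normalize_map(cam):
    """Rescale a single heat map to [0, 1] for visualization."""
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Hypothetical sizes: 8 feature channels, a 7x7 spatial grid, 3 attributes.
rng = np.random.default_rng(0)
features = rng.random((8, 7, 7))
weights = rng.standard_normal((3, 8))
bias = np.zeros(3)

cams = class_activation_maps(features, weights)
logits = attribute_logits(features, weights, bias)
heat = normalize_map(cams[0])
```

Because pooling and the linear layer commute, the spatial mean of each heat map equals the corresponding attribute score minus its bias. This is why the heat map can be read as a per-location breakdown of the classifier's evidence, and why a loss defined on the heat map can influence the classifier during training.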
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition