Article ID Journal Published Year Pages File Type
6864585 Neurocomputing 2018 37 Pages PDF
Abstract
In this paper, we propose a novel semi-supervised method that predicts clothing attributes with the assistance of unlabeled data such as fashion shows. To this end, we build a two-stage framework: an unsupervised triplet network pre-training stage, which ensures that frames in the same video have coherent representations while frames from different videos have larger feature distances, and a supervised clothing attribute prediction stage, which estimates the attribute values. Specifically, we first detect the clothes in the frames of the collected 18,737 female fashion shows and 21,224 male fashion shows, which carry no extra labels. Then a triplet neural network is constructed by embedding the temporal appearance consistency between frames in the same video and the representation gap between frames in different videos. Finally, we transfer the triplet model parameters to a multi-task clothing attribute prediction model and fine-tune it on clothing images that have attribute labels. Extensive experiments on two clothing datasets demonstrate the advantages of the proposed method.
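The pre-training objective described in the abstract can be illustrated with a minimal sketch. It assumes the standard margin-based triplet loss over Euclidean distances; the abstract does not state the exact loss, distance, or margin used in the paper, so `triplet_loss`, `sample_triplet`, and the `margin` value are hypothetical names and choices made only for illustration. The anchor and positive are two frames from the same video, and the negative is a frame from a different video.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (an assumption, not the paper's
    confirmed formulation): zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def sample_triplet(videos, rng):
    """Draw (anchor, positive, negative) frame features.

    `videos` maps a video id to a list of per-frame feature vectors.
    Anchor and positive come from the same video; the negative comes
    from a different one. No attribute labels are needed.
    """
    eligible = [v for v, frames in videos.items() if len(frames) >= 2]
    anchor_video = rng.choice(eligible)
    other_video = rng.choice([v for v in videos if v != anchor_video])
    anchor, positive = rng.sample(videos[anchor_video], 2)
    negative = rng.choice(videos[other_video])
    return anchor, positive, negative
```

In a full pipeline the feature vectors would come from the network being pre-trained, and the learned weights would then initialize the attribute prediction model before fine-tuning on labeled images; that part is not shown here.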
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence