Article code: 533805
Journal code: 870167
Publication year: 2015
English article: 7-page PDF
Full-text version: free download
English title of the ISI article
Learning to classify gender from four million images
Persian translation of the title
یادگیری طبقه بندی جنسیت از چهار میلیون عکس (Learning gender classification from four million photos)
Keywords
Big data, gender classification, online learning
Related topics
Engineering and basic sciences; Computer engineering; Computer vision and pattern recognition
English abstract


• We automatically assemble a big dataset to train a face gender classifier.
• This is formed by 4 million images and over 60,000 features.
• The resulting system significantly outperforms the previous state of the art without human annotation.
• This study lends support to the “unreasonable effectiveness of data” conjecture.
• This study is relevant to computer vision (LBP features, face classification), machine learning (large scale linear classifiers), and big data.
• This study can serve as a template for other “web scale” learning tasks.

The application of learning algorithms to big datasets has long been identified as an effective way to attack important tasks in pattern recognition, but the generation of large annotated datasets has a significant cost. We present a simple and effective method to generate a classifier of face images, by training a linear classification algorithm on a massive dataset entirely assembled and labelled by automated means. In doing so, we perform the largest experiment on face gender recognition published so far, reporting the highest performance yet. Four million images and more than 60,000 features are used to train online classifiers. By using an ensemble of linear classifiers, we achieve an accuracy of 96.86% on the most challenging public database, Labeled Faces in the Wild (LFW), 2.05% higher than the previous best result on the same dataset (Shan, 2012). This result is relevant both for the machine learning community, addressing the role of large datasets, and the computer vision community, providing a way to make high quality face gender classifiers. Furthermore, we propose a general way to generate and exploit massive data without human annotation. Finally, we demonstrate a simple and effective adaptation of the Pegasos algorithm that makes it more robust.
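
To make the idea of an online linear classifier and an ensemble of such classifiers concrete, the following is a minimal sketch of the standard Pegasos update (stochastic sub-gradient descent on the linear SVM objective). It is not the adapted variant mentioned in the abstract, whose details are not given here. The function names `pegasos_train` and `ensemble_predict`, the hyperparameter values, and the two-dimensional toy data standing in for image features are all assumptions made for illustration.

```python
import numpy as np


def pegasos_train(X, y, lam=0.01, n_iters=5000, seed=0):
    """Standard Pegasos for a linear SVM without a bias term.

    X: array of shape (n_samples, n_features); y: labels in {-1, +1}.
    lam, n_iters and seed are illustrative choices, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    radius = 1.0 / np.sqrt(lam)  # the optimum lies inside this ball
    for t in range(1, n_iters + 1):
        i = rng.integers(n_samples)      # pick one training example at random
        eta = 1.0 / (lam * t)            # decreasing step size
        margin = y[i] * (X[i] @ w)       # margin computed with the current w
        w *= 1.0 - eta * lam             # shrink step from the regulariser
        if margin < 1.0:                 # hinge loss is active for this example
            w += eta * y[i] * X[i]
        norm = np.linalg.norm(w)
        if norm > radius:                # optional projection step
            w *= radius / norm
    return w


def ensemble_predict(weights, X):
    """Combine several linear classifiers by summing their scores."""
    scores = sum(X @ w for w in weights)
    return np.where(scores >= 0, 1, -1)


# Toy data (an assumption): two Gaussian clusters in place of real face features.
data_rng = np.random.default_rng(42)
y = data_rng.choice([-1, 1], size=400)
X = y[:, None] * np.array([2.0, 2.0]) + data_rng.normal(scale=0.5, size=(400, 2))

# Train a small ensemble that differs only in the random sampling order.
weights = [pegasos_train(X, y, seed=s) for s in range(5)]
acc = float(np.mean(ensemble_predict(weights, X) == y))
print(f"training accuracy of the toy ensemble: {acc:.3f}")
```

Because each update touches a single example, memory use does not grow with the size of the training set, which is one reason this family of solvers is a common choice when training on millions of examples.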

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 58, 1 June 2015, Pages 35–41
Authors
, ,