Deep Learning neural nets versus traditional machine learning in gender identification of authors of RusProfiling texts

Article ID	Journal	Published Year	Pages	File Type
6900932	Procedia Computer Science	2018	8 Pages	PDF

Abstract

In this paper we compare the accuracy of two types of data-driven modeling approaches on the task of gender identification for RusProfiling texts without gender deception: on the one hand, well-known conventional machine learning algorithms, such as Support Vector Machine and Gradient Boosting; on the other hand, a set of Deep Learning neural networks, such as topologies with convolutional, fully connected, and Long Short-Term Memory layers. The dependence of the effectiveness of these models on feature selection and on feature representation is investigated. The obtained F1-score of 88% establishes the state of the art in the gender identification task with the RusProfiling corpus.
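
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score can be computed for a two-class gender identification task. The abstract does not state which averaging scheme was used, so both the per-class and the macro-averaged variants are shown as assumptions; the function names and the toy labels are hypothetical and are not taken from the paper or from the RusProfiling corpus.

```python
def f1_score(y_true, y_pred, positive):
    """Binary F1 for the class `positive`: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so F1 is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)


# Toy data for illustration only: "f" = female author, "m" = male author.
y_true = ["f", "f", "f", "f", "m", "m", "m", "m"]
y_pred = ["f", "f", "f", "m", "m", "m", "m", "f"]

print(f1_score(y_true, y_pred, positive="f"))
print(macro_f1(y_true, y_pred, labels=["f", "m"]))
```

The same function can be applied to the predictions of any of the compared models, conventional or neural, which is what makes a single metric usable for the comparison described above.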

Keywords

Neural networks; Gender identification; Data-driven modeling; Natural Language Processing