Article ID: 6900932
Journal: Procedia Computer Science
Published Year: 2018
Pages: 8
File Type: PDF
Abstract
In this paper we compare the accuracy of two types of data-driven modeling approaches on the task of gender identification for RusProfiling texts without gender deception: on the one hand, well-known conventional machine learning algorithms, such as Support Vector Machine and Gradient Boosting; on the other hand, a set of deep learning neural networks, including topologies with convolutional, fully connected, and Long Short-Term Memory layers. We investigate how the effectiveness of these models depends on feature selection and on feature representation. The obtained F1-score of 88% establishes the state of the art in the gender identification task on the RusProfiling corpus.
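For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score can be computed for a two-class labeling task. The abstract does not state which averaging the authors used, so both the per-class and the macro-averaged variants are shown; the function names `f1_score` and `macro_f1`, the labels `"f"` and `"m"`, and the toy data are illustrative assumptions, not taken from the paper.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical toy labels (assumption): "f" = female author, "m" = male author.
y_true = ["f", "f", "f", "m", "m", "m", "m", "f"]
y_pred = ["f", "f", "m", "m", "m", "m", "f", "f"]

print(f1_score(y_true, y_pred, "f"))
print(macro_f1(y_true, y_pred))
```

With balanced classes the per-class and macro-averaged values tend to be close; they diverge when one class is much rarer than the other.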
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)