Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6900932 | Procedia Computer Science | 2018 | 8 Pages |
Abstract
In this paper we compare the accuracy of two types of data-driven modeling approaches on the task of gender identification of RusProfiling texts without gender deception: on the one hand, well-known conventional machine learning algorithms, such as Support Vector Machine and Gradient Boosting; and, on the other hand, a set of deep learning neural networks, such as network topologies with convolutional, fully connected, and Long Short-Term Memory layers. The dependence of the effectiveness of these models on feature selection and on feature representation is investigated. The obtained F1-score of 88% establishes the state of the art in the gender identification task on the RusProfiling corpus.
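As an illustration of the metric reported above, the following minimal sketch computes a per-class F1-score and its macro average in plain Python. The label names, the toy predictions, and the choice of macro averaging are assumptions made for this example only; the abstract does not state which averaging the authors used, and the data below is not taken from the RusProfiling corpus.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true))
    return sum(f1_score(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical toy labels: "f" = female author, "m" = male author.
y_true = ["f", "f", "f", "m", "m", "m", "m", "f"]
y_pred = ["f", "f", "m", "m", "m", "f", "m", "f"]

print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

The same function can score any classifier's predictions, so conventional and neural models can be compared on an equal footing as long as they are evaluated on the same held-out texts.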
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science (General)
Authors
Alexander Sboev, Ivan Moloshnikov, Dmitry Gudovskikh, Anton Selivanov, Roman Rybka, Tatiana Litvinova