Article ID: 6900929
Journal: Procedia Computer Science
Published Year: 2018
Pages: 7
File Type: PDF
Abstract
We analyze approaches to identifying author gender in Russian-language texts with gender deception, using data-driven models based on conventional machine learning (Support Vector Classifier, Decision Tree, Gradient Boosting) and neural network algorithms (convolutional layers, long short-term memory layers, etc.). The training and testing data are collections of texts from the Gender Imitation corpus, expanded by crowdsourcing and supplemented with files from the RusProfiling and RusPersonality corpora. The accuracy achieved at this stage of the task is presented and discussed.
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)