Automatic text classification algorithm based on Gauss improved convolutional neural network

Article ID	Journal	Published Year	Pages	File Type
4950988	Journal of Computational Science	2017	22 Pages	PDF

Abstract

The traditional k-nearest-neighbor (KNN) query is a stable and accurate classification algorithm. However, when the sample set is very large, its computational efficiency degrades considerably. This paper therefore proposes a parallel MKNN text classification algorithm based on clustering-center texts. First, the clustering centers reduce the number of similarity calculations: the original large-scale document samples are replaced with a relatively small number of cluster centers, which speeds up the KNN query. Second, the MapReduce parallel framework is applied, taking the characteristics of text classification into account, to meet the real-time demands of large-scale text classification, overcoming the slow KNN query while keeping classification accuracy as high as possible. Finally, experiments comparing classification accuracy and running efficiency against a comparable single-threaded algorithm show that the proposed algorithm effectively improves classification speed while maintaining sufficient accuracy.
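
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering method, the similarity measure, or how the MapReduce job is structured, so this sketch assumes per-class k-means, cosine similarity, and a map/reduce pair simulated in a single process. All function names (`build_centers`, `map_top_k`, `reduce_vote`, `classify`) are hypothetical.

```python
import heapq
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two dense vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iters=10):
    """Plain k-means, initialised deterministically with the first k vectors."""
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in vectors:
            best = max(range(len(centers)), key=lambda i: cosine(v, centers[i]))
            groups[best].append(v)
        # Keep the old center when a cluster receives no vectors.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def build_centers(samples, centers_per_class):
    """Step 1: replace the labelled samples with a few centers per class."""
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    centers = []
    for label, vecs in by_label.items():
        for c in kmeans(vecs, min(centers_per_class, len(vecs))):
            centers.append((c, label))
    return centers


def map_top_k(partition, query, k):
    """Map step: the k most similar centers within one partition."""
    scored = [(cosine(query, c), label) for c, label in partition]
    return heapq.nlargest(k, scored)


def reduce_vote(local_results, k):
    """Reduce step: merge local top-k lists, then take a majority vote."""
    merged = heapq.nlargest(k, (item for part in local_results for item in part))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def classify(centers, query, k=3, n_partitions=2):
    """Step 2: KNN over the centers, split into simulated partitions."""
    partitions = [centers[i::n_partitions] for i in range(n_partitions)]
    local = [map_top_k(p, query, k) for p in partitions]
    return reduce_vote(local, k)


# Toy 2-D "document vectors"; real inputs would be e.g. TF-IDF vectors.
samples = [
    ([1.0, 0.1], "sports"), ([0.9, 0.2], "sports"), ([0.8, 0.0], "sports"),
    ([0.1, 1.0], "finance"), ([0.2, 0.9], "finance"), ([0.0, 0.8], "finance"),
]
centers = build_centers(samples, centers_per_class=2)
label = classify(centers, [1.0, 0.05], k=3)
```

Because each query is compared only with the centers rather than with every training document, the number of similarity computations per query depends on the number of centers, not on the corpus size. In an actual MapReduce deployment, each call to `map_top_k` would run on a separate worker and `reduce_vote` would run in the reducer.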

Keywords

Clustering algorithm; Natural language; Neural network; Text classification; Parallel computation