Article code: 433949
Journal code: 1441628
Publication year: 2016
English article: 25 pages, PDF
Full-text version: free download
English title of the ISI article
TextFlows: A visual programming platform for text mining and natural language processing
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• TextFlows, a new workflow management web platform for text mining and NLP, was developed.
• A survey and a detailed comparison of TextFlows with five other NLP platforms are provided.
• TextFlows enables simple evaluation of algorithms from the NLTK, LATINO and scikit-learn libraries.
• LATINO's Max Entropy classifier achieves best results in document categorization.
• Part-Of-Speech tagging improves the accuracy of document classification.

Text mining and natural language processing are fast-growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.

Publisher
Database: Elsevier - ScienceDirect
Journal: Science of Computer Programming - Volume 121, 1 June 2016, Pages 128–152