Article code: 10132582
Journal code: 1645565
Publication year: 2018
English article: 30 pages, PDF
Full-text version: free download
English title of the ISI article
NetClass: A network-based relational model for document classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
To handle the complexity inherent in human textual communication, Automatic Document Classification (ADC) methods often adopt several simplifications. One such simplification is to treat the terms that compose documents as independent, which may hide important relationships between them. These relationships can encapsulate non-trivial and effective patterns that improve classification effectiveness. In this work, we propose NetClass, a new network-based model for documents that explicitly considers term relationships, and introduce a family of relational algorithms for ADC, such as the LRN-WRN classifier, a lazy relational ADC algorithm that exploits not only relationships between terms but also neighborhood information. As our extensive experimental evaluation shows, the proposed LRN-WRN achieves competitive performance when compared to the state of the art in ADC, including SVM, across seven distinct domains. More specifically, LRN-WRN outperforms state-of-the-art classifiers in 5 out of 7 domains and ranks among the top two classifiers in all assessed domains. Our evaluation highlights the high effectiveness of our proposal, as well as its efficiency in terms of runtime. Indeed, besides effectiveness and efficiency, the simplicity of our proposal and the absence of complex parameter tuning are key characteristics that make our algorithms interesting alternatives for ADC. In particular, as highlighted by our experimental evaluation, LRN-WRN was shown to be a promising alternative for dynamic domains with a huge volume of short texts (e.g., social media content) or with several classes.
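The abstract's central idea, representing term relationships as a network rather than treating terms as independent, can be illustrated with a minimal sketch. This is not the NetClass model or the LRN-WRN algorithm, whose details are not given here. It assumes whitespace tokenization, document-level co-occurrence as the only relationship, and a simple normalized edge-overlap score, and it leaves out the lazy, neighborhood-based part entirely. The function names `build_class_networks` and `classify` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_class_networks(docs, labels):
    """Build one weighted term co-occurrence network per class.

    Assumption: an edge is an unordered pair of terms, and its weight is
    the number of training documents of that class containing both terms.
    """
    networks = defaultdict(Counter)
    for text, label in zip(docs, labels):
        # Sorting makes each unordered pair appear in one canonical order.
        terms = sorted(set(text.lower().split()))
        networks[label].update(combinations(terms, 2))
    return networks


def classify(text, networks):
    """Pick the class whose network shares the most edge weight with the text."""
    terms = sorted(set(text.lower().split()))
    pairs = list(combinations(terms, 2))
    scores = {}
    for label, net in networks.items():
        total = sum(net.values()) or 1  # normalize by total edge weight
        scores[label] = sum(net[pair] for pair in pairs) / total
    return max(scores, key=scores.get), scores
```

With a handful of labeled training texts, `classify` returns the predicted label together with the per-class scores. A text whose term pairs appear in no class network scores zero for every class.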
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 469, December 2018, Pages 60-78