Free article download: Ensemble of keyword extraction methods and classifiers in text classification

Article code	Journal code	Publication year	English article	Full-text version
381975	660712	2016	16-page PDF	Free download

English title of the ISI article

Ensemble of keyword extraction methods and classifiers in text classification

Keywords

Keyword extraction; text classification; ensemble learning; scientific text classification

Related subjects

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

Abstract

• Text classification is a domain with a high-dimensional feature space.
• Extracting keywords and using them as features can be extremely useful in text classification.
• An empirical analysis of five statistical keyword extraction methods is presented.
• A comprehensive analysis of classifier and keyword extraction ensembles is conducted.
• For the ACM collection, a Bagging ensemble of Random Forest reaches a classification accuracy of 93.80%.
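
The idea behind the second highlight, keywords as a compact feature set, can be sketched in a few lines. The snippet below is a minimal illustration of frequency-based keyword extraction, not the authors' implementation: the stopword list, the tokenizer, the function names and the cut-off `k` are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use a fuller list).
STOPWORDS = {"a", "an", "and", "are", "as", "in", "is", "of", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def most_frequent_keywords(text, k=3):
    """Return the k most frequent non-stopword terms of one document."""
    return [term for term, _ in Counter(tokenize(text)).most_common(k)]

def keyword_features(docs, k=3):
    """Build a shared keyword vocabulary and one count vector per document."""
    vocab = sorted({kw for doc in docs for kw in most_frequent_keywords(doc, k)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

docs = [
    "graph graph graph node node edge",
    "query query query index index ranking",
]
vocab, vectors = keyword_features(docs, k=2)
print(vocab)    # ['graph', 'index', 'node', 'query']
print(vectors)  # [[3, 0, 2, 0], [0, 2, 0, 3]]
```

Each document is reduced to counts over the union of per-document keywords, so the feature space grows with the number of distinct keywords rather than with the full vocabulary.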

Automatic keyword extraction is an important research direction in text mining, natural language processing and information retrieval. Keyword extraction makes it possible to represent text documents in a condensed way, and this compact representation is helpful in several applications, such as automatic indexing, automatic summarization, automatic classification, clustering and filtering. Text classification, for instance, faces the challenge of a high-dimensional feature space, so extracting the words most relevant to a document's content and using these keywords as features can be extremely useful. In this regard, this study examines the predictive performance of five statistical keyword extraction methods (most-frequent-measure-based keyword extraction, term frequency-inverse sentence frequency based keyword extraction, co-occurrence statistical information based keyword extraction, eccentricity-based keyword extraction and the TextRank algorithm) on classification algorithms and ensemble methods for scientific text document classification (categorization). The study compares four base learning algorithms (Naïve Bayes, support vector machines, logistic regression and Random Forest) with five widely used ensemble methods (AdaBoost, Bagging, Dagging, Random Subspace and Majority Voting). To the best of our knowledge, this is the first empirical analysis that evaluates the effectiveness of statistical keyword extraction methods in conjunction with ensemble learning algorithms. The classification schemes are compared in terms of classification accuracy, F-measure and area under the curve (AUC), and a two-way ANOVA test is employed to validate the empirical analysis.
The experimental analysis indicates that the Bagging ensemble of Random Forest with the most-frequent-based keyword extraction method yields promising results for text classification. For the ACM document collection, this combination obtains the highest average predictive performance (93.80%). In general, Bagging and Random Subspace ensembles of Random Forest yield promising results. The empirical analysis indicates that a keyword-based representation of text documents, used in conjunction with ensemble learning, can enhance the predictive performance and scalability of text classification schemes, which is of practical importance in the application fields of text classification.
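
As a rough illustration of two of the ensemble methods named in the abstract, the sketch below trains base learners on bootstrap samples (Bagging) and combines their predictions by plurality (Majority Voting). It is a simplified sketch under stated assumptions: the base learner is a toy nearest-centroid classifier standing in for Random Forest, and the class, function and parameter names are hypothetical rather than taken from the paper.

```python
import random
from collections import Counter

class NearestCentroid:
    """Toy base learner (a stand-in, not one of the classifiers in the study)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        # Per-class mean vector of the training rows.
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, row):
        def dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(row, centroid))
        # Iterate labels in sorted order so that ties are broken deterministically.
        return min(sorted(self.centroids), key=lambda lab: dist(self.centroids[lab]))

def bagging_fit(X, y, n_estimators=11, seed=0):
    """Train each base learner on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(NearestCentroid().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models

def majority_vote(models, row):
    """Combine the base learners' predictions by plurality vote."""
    votes = Counter(m.predict_one(row) for m in models)
    return votes.most_common(1)[0][0]

# Keyword-count vectors (hypothetical data) with two class labels.
X = [[3, 0, 2, 0], [4, 1, 3, 0], [0, 2, 0, 3], [0, 3, 1, 4]]
y = ["graphs", "graphs", "retrieval", "retrieval"]
models = bagging_fit(X, y)
print(majority_vote(models, [5, 0, 2, 0]))
```

With so few training rows, an individual bootstrap sample may contain only one class; the vote across many base learners is what smooths out such unlucky draws.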

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 57, 15 September 2016, Pages 232–247

Authors

Aytuğ Onan, Serdar Korukoğlu, Hasan Bulut
