Free article download: Effectively classifying short texts by structured sparse representation with dictionary filtering

Article code	Journal code	Publication year	English article	Full-text version
391987	664584	2015	13-page PDF	Free download

English title of the ISI article

Effectively classifying short texts by structured sparse representation with dictionary filtering


Keywords

Short text classification, sparse representation, group sparsity, dictionary filtering

Related subjects

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence


Abstract

• Structured sparsity is introduced to short text classification (STC) to address the feature sparsity problem.
• A more compact dictionary is constructed to reduce data correlation and redundancy.
• The new dictionary boosts both classification performance and efficiency.
• Experiments over five corpora show that our method outperforms traditional STC methods.
• An experiment also shows that our method is better at exploiting external sources.

Short text classification (STC) has attracted increasing interest with the rapid growth of Web and social media data that exist in short text form. It is more challenging than traditional text classification (TC) because short texts yield sparse features, so state-of-the-art TC approaches perform poorly when applied to short texts directly. Existing STC approaches deal with the sparsity problem mainly by enriching text content with external corpora or additional information. Although this can improve performance, the gain depends heavily on the amount and quality of the external or additional information. Worse, such information is not always available, and acquiring it can be costly. In this paper, we introduce a structured sparse representation classifier to classify short texts effectively, and develop an approach called convex hull vertices selection to reduce data correlation and redundancy in the dictionary (the set of training texts), which substantially boosts STC efficiency and performance. To the best of our knowledge, this is the first work that exploits structured sparsity for STC. Experiments over five datasets show that the proposed approach outperforms state-of-the-art TC methods in classification effectiveness, and outperforms the traditional sparse representation (SR) classifier in both classification effectiveness and efficiency. Furthermore, we carry out an experiment on short texts expanded with additional content, which indirectly shows that our approach performs better than existing STC methods that exploit external text sources.
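As a rough illustration of the idea of classifying with a structured sparse representation over a dictionary of training texts, the sketch below codes a test vector with a group-lasso penalty and assigns the class whose training texts reconstruct it best. This is not the authors' implementation: the choice of one group per class, the proximal gradient solver, the value of `lam`, and the toy term vectors are all assumptions made for illustration, and the convex hull vertices selection step is not shown.

```python
import numpy as np

def group_soft_threshold(x, groups, thresh):
    """Proximal operator of thresh * sum_g ||x_g||_2 (block soft-thresholding)."""
    out = np.zeros_like(x)
    for idx in groups:
        norm = np.linalg.norm(x[idx])
        if norm > thresh:
            out[idx] = (1.0 - thresh / norm) * x[idx]
    return out

def group_sparse_code(D, y, groups, lam=0.1, n_iter=500):
    """Minimise 0.5 * ||y - D x||^2 + lam * sum_g ||x_g||_2 by proximal gradient."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = group_soft_threshold(x - step * grad, groups, step * lam)
    return x

def classify(D, labels, y, lam=0.1):
    """Assign y to the class whose dictionary columns give the smallest residual."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Assumption: one coefficient group per class.
    groups = [np.flatnonzero(labels == c) for c in classes]
    x = group_sparse_code(D, y, groups, lam)
    residuals = [np.linalg.norm(y - D[:, g] @ x[g]) for g in groups]
    return classes[int(np.argmin(residuals))]

# Hypothetical toy dictionary: each column is the term vector of one training text.
# Class 0 texts use terms 0 and 1; class 1 texts use terms 2 and 3.
D = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])
D = D / np.linalg.norm(D, axis=0)  # L2-normalise each column
labels = np.array([0, 0, 1, 1])

# A very short test text containing only term 1.
y = np.array([0.0, 1.0, 0.0, 0.0])
predicted = classify(D, labels, y)
```

Grouping coefficients by class means that a whole class is either used or dropped when reconstructing a text, which is one common way to impose structure on a sparse code. Other groupings are possible and may be closer to what the paper does.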

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 323, 1 December 2015, Pages 130–142

Authors

Longwen Gao, Shuigeng Zhou, Jihong Guan
