Free article download: Integrating Low-rank Approximation and Word Embedding for Feature Transformation in the High-dimensional Text Classification

Article code	Journal code	Publication year	English article	Full text
4960622	1446503	2017	10-page PDF	Free download

Title of the ISI article

Integrating Low-rank Approximation and Word Embedding for Feature Transformation in the High-dimensional Text Classification

Keywords

Low-rank approximation, word embedding, feature transformation, text classification

Related subjects

Engineering and basic sciences; Computer engineering; Computer science (general)

Abstract

With the Bag-of-Words model, a document corpus can initially be represented by a Terms-Documents matrix. However, the high-dimensional pure Terms-Documents matrix needs to be transformed into a lower-dimensional semantic Concepts-Documents matrix, both to reduce the dimension of the feature space and to create more meaningful features. This paper analyzes two feature transformation (FT) models on the Terms-Documents matrix: the FT model based on Low-Rank Approximation (LRA) and the FT model based on Word Embedding (WE). Each has its own strengths and weaknesses in text transformation. The LRA-based FT takes a purely mathematical perspective, statistically covering the original dispersed term set of the corpus as well as possible, while the WE-based FT uses the available word embedding vectors to enhance the contextual content of the corpus representation. Therefore, combinations of the LRA-based FT and the WE-based FT, named LRAintoWE-based FT and WEintoLRA-based FT, are proposed to obtain comprehensive FTs that appropriately capture both the statistical information and the contextual information. The experimental results on three benchmark datasets show that the information of the WE-based FT and the LRA-based FT can be integrated, and that their integration as LRAintoWE-based FT and WEintoLRA-based FT can improve classification performance compared with using either of them alone.
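The abstract names the two feature transformations without giving their formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the LRA-based FT can be illustrated by a truncated SVD projection and the WE-based FT by multiplying the Terms-Documents matrix with a term-embedding matrix. The toy counts, the embedding values, and the function names `lra_transform` and `we_transform` are all hypothetical. The combined LRAintoWE and WEintoLRA variants are not shown, because the abstract does not describe how they are built.

```python
import numpy as np

# Toy Terms-Documents matrix: rows are terms, columns are documents.
# The counts are invented for illustration only.
terms = ["bank", "loan", "river", "water"]
X = np.array([
    [2.0, 1.0, 1.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

def lra_transform(X, k):
    """Assumed LRA-style FT: project documents onto the top-k left
    singular vectors of X, giving a (k, n_documents) matrix."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]
    return U_k.T @ X

def we_transform(X, E):
    """Assumed WE-style FT: weight each term's embedding by its count,
    giving a (d, n_documents) matrix for d-dimensional embeddings."""
    return E.T @ X

# Hypothetical 2-dimensional embeddings, one row per term.
E = np.array([
    [0.9, 0.1],
    [1.0, 0.0],
    [0.1, 0.9],
    [0.0, 1.0],
])

C_lra = lra_transform(X, k=2)  # Concepts-Documents matrix from the SVD
C_we = we_transform(X, E)      # Concepts-Documents matrix from embeddings
print(C_lra.shape, C_we.shape)
```

In both cases the four term rows are replaced by two concept rows per document; the difference lies in where the concepts come from: the corpus statistics alone in the first case, externally supplied embedding vectors in the second.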

Publisher

Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 112, 2017, Pages 437-446

Authors

Le Nguyen Hoai Nam, Ho Bao Quoc
