Article code: 383564
Journal code: 660826
Publication year: 2016
English article: 15-page PDF
Full-text version: free download
English title of the ISI article
Hybrid linear matrix factorization for topic-coherent terms clustering
Keywords
Matrix factorization; High-dimension reduction; Term clustering
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract


• We propose a novel Karhunen–Loève Transformation (KLT) for dimension reduction.
• A Karhunen–Loève expansion based on the Wiener process is applied to the KLT results for optimization.
• State-of-the-art topic-coherence metrics are used for word clustering and evaluation.

Topic-coherent term clustering is the foundation of document organization, corpus summarization and document classification. It is especially useful for the emerging problems of big data. However, a term clustering method that copes with high-dimensional data of variable length and topics while achieving high topic coherence is still needed, and developing one remains a challenging research problem. This paper proposes a hybrid linear matrix factorization method that identifies topic-coherent terms in documents to form a thesaurus for clustering. Starting from an analog of the Karhunen–Loève transformation that maps principal component analysis (PCA) scores fully into the factor-coefficient (loadings) space of factor analysis (FA), the high dimension of the full set of PCA scores is reduced, and topic-coherent terms are classified by the main FA factors, which may correspond to topics. The Karhunen–Loève transformation reduces the total mean square error, which increases topic coherence. The initial transformation is then further optimized by a Karhunen–Loève expansion based on the stochastic Wiener process. The resulting optimal topic-coherent bags of terms are used to build a more topic-coherent model. The approach is evaluated on the CISI, MedSH and Tweets datasets at different sizes and numbers of topics, and it outperforms the methods it is compared with.
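
The abstract's statement that the Karhunen–Loève transformation reduces the total mean square error can be illustrated with the plain, textbook KLT (equivalent to PCA). The sketch below is not the authors' hybrid method: the function name `klt_reduce`, the synthetic data, and the choice of k = 2 are assumptions made only for illustration. It projects data onto the top-k eigenvectors of the sample covariance and checks that the reconstruction error equals the sum of the discarded eigenvalues.

```python
import numpy as np

def klt_reduce(X, k):
    """Textbook KLT/PCA: project rows of X onto the top-k covariance eigenvectors.

    Hypothetical helper for illustration; not the paper's hybrid factorization.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    basis = eigvecs[:, :k]                       # top-k eigenvectors
    scores = Xc @ basis                          # reduced representation
    X_hat = scores @ basis.T + mean              # reconstruction from k components
    return scores, X_hat, eigvals

# Assumed toy data: 200 "documents" x 6 "term" features driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

scores, X_hat, eigvals = klt_reduce(X, k=2)

# Total squared reconstruction error, normalized like the covariance (n - 1).
mse = np.sum((X - X_hat) ** 2) / (X.shape[0] - 1)
discarded = eigvals[2:].sum()                    # variance in the dropped directions
print(mse, discarded)
```

The two printed values agree up to floating-point error, which is the sense in which truncating the KLT basis minimizes the total mean square error among linear projections of the same rank.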

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 62, 15 November 2016, Pages 358–372