کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
407547 678146 2015 13 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Dimensionality reduction for documents with nearest neighbor queries
ترجمه فارسی عنوان
کاهش ابعاد برای اسناد با نزدیکترین درخواست های همسایه
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی

Document collections are often stored as sets of sparse, high-dimensional feature vectors. Performing dimensionality reduction (DR) on such high-dimensional datasets for the purposes of visualization presents algorithmic and qualitative challenges for existing DR techniques. We propose the Q-SNE algorithm for dimensionality reduction of document data, combining the scalable probability-based layout approach of BH-SNE with an improved component to calculate approximate nearest neighbors, using the query-based APQ approach that exploits an impact-ordered inverted file. We provide thorough experimental evidence that Q-SNE yields substantial quality improvements for layouts of large document collections with commensurate speed. Our experiments were conducted with six real-world benchmark datasets that range up to millions of documents and terms, and compare against three alternatives for nearest neighbor search and five alternatives for dimensionality reduction.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 150, Part B, 20 February 2015, Pages 557–569
نویسندگان
, ,