Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6873082 | 1440628 | 2018 | 16-page PDF | Free download |
English title of the ISI article
Benchmarking top-k keyword and top-k document processing with T2K2 and T2K2D2
Related topics
Engineering and basic sciences
Computer engineering
Computational theory and mathematics
Abstract
Top-k keyword and top-k document extraction are very popular text analysis techniques. Top-k keywords and documents are often computed on-the-fly, but they exploit weighted vocabularies that are costly to build. To compare competing weighting schemes and database implementations, benchmarking is customary. To the best of our knowledge, no benchmark currently addresses these problems. Hence, in this paper, we present T2K2, a top-k keywords and documents benchmark, and its decision support-oriented evolution T2K2D2. Both benchmarks feature a real tweet dataset and queries with various complexities and selectivities. They help evaluate weighting schemes and database implementations in terms of computing performance. To illustrate our benchmarks' relevance and genericity, we successfully ran performance tests on the TF-IDF and Okapi BM25 weighting schemes, on one hand, and on different relational (Oracle, PostgreSQL) and document-oriented (MongoDB) database implementations, on the other hand.
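As a rough illustration of the kind of computation the abstract refers to, the sketch below ranks the top-k keywords of a tiny corpus with a textbook TF-IDF weight. It is not the benchmark's own definition or query workload: the function name `top_k_keywords`, the whitespace tokenizer, and the particular TF and IDF variants are assumptions chosen for brevity.

```python
import math
from collections import Counter


def top_k_keywords(docs, k):
    """Return the k terms with the highest summed TF-IDF weight over docs.

    Assumed (hypothetical) choices: lowercase whitespace tokenization,
    TF = term count / document length, IDF = ln(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = Counter()
    for tokens in tokenized:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / df[term])
            scores[term] += tf * idf

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


docs = [
    "database benchmark query",
    "database index query",
    "database tweet tweet",
]
top2 = top_k_keywords(docs, 2)
```

With this toy corpus, a term that appears in every document (here `database`) receives an IDF of zero and drops out of the ranking, which is the usual motivation for weighting raw counts. Okapi BM25 would replace the `tf * idf` product with a saturating, length-normalized term weight.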
Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 85, August 2018, Pages 60-75
Authors
Ciprian-Octavian Truică, Jérôme Darmont, Alexandru Boicea, Florin Rădulescu