Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
9651063 | 666434 | 2005 | 20-page PDF | Free download |
English title of the ISI article
Exploiting concept clusters for content-based information retrieval
Related topics
Engineering and Basic Sciences
Computer Engineering
Artificial Intelligence
Abstract
Current approaches to index weighting for information retrieval from texts are based on statistical analysis of the texts' contents. A key shortcoming of these indexing schemes, which consider only the terms in a document, is that they cannot extract semantically exact indexes that represent the semantic content of a document. To address this issue, we proposed a new indexing formalism that considers not only the terms in a document, but also the concepts. In the proposed method, concepts are extracted by exploiting clusters of terms that are semantically related, referred to as concept clusters. Through experiments on the TREC-2 collection of Wall Street Journal documents, we show that the proposed method outperforms an indexing method based on term frequency (TF), especially in regard to the highest-ranked documents. Moreover, the index term dimension was 53.3% lower for the proposed method than for the TF-based method, which is expected to significantly reduce the document search time in a real environment.
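The contrast the abstract draws between term-level and concept-level indexing can be illustrated with a minimal sketch. The abstract does not say how the concept clusters are built or weighted, so the `CONCEPT_CLUSTERS` mapping, the function names, and the sample tokens below are all hypothetical. The sketch only shows why grouping related terms under one concept shrinks the index dimension; it is not the authors' method.

```python
from collections import Counter

# Hypothetical concept clusters: each label groups semantically related terms.
# These groupings are illustrative assumptions, not taken from the paper.
CONCEPT_CLUSTERS = {
    "vehicle": {"car", "truck", "automobile"},
    "finance": {"bank", "loan", "stock"},
}

# Reverse lookup from each term to its concept label.
TERM_TO_CONCEPT = {
    term: concept
    for concept, terms in CONCEPT_CLUSTERS.items()
    for term in terms
}


def tf_index(tokens):
    """Term-frequency index: one dimension per distinct term."""
    return Counter(tokens)


def concept_index(tokens):
    """Concept-level index: clustered terms share one dimension;
    terms outside every cluster keep their own dimension."""
    return Counter(TERM_TO_CONCEPT.get(tok, tok) for tok in tokens)


tokens = "car truck automobile bank loan stock market car".split()
tf = tf_index(tokens)
ci = concept_index(tokens)
print(len(tf), len(ci))  # prints "7 3"
```

In this toy input, seven distinct terms collapse into three index dimensions. The 53.3% reduction reported in the abstract comes from the authors' experiments on the TREC-2 collection, not from a mapping like this one.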
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 170, Issues 2–4, 25 February 2005, Pages 443-462
Authors
Bo-Yeong Kang, Dae-Won Kim, Sang-Jo Lee