Article code: 379332
Journal code: 659291
Publication year: 2007
English article: 17 pages, PDF
Full-text version: free download
English title of the ISI article
The phrase-based vector space model for automatic retrieval of free-text medical documents
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Objective: To develop a document indexing scheme that improves retrieval effectiveness for free-text medical documents.

Design: The phrase-based vector space model (VSM) uses multi-word phrases as indexing terms. Each phrase consists of a concept in the Unified Medical Language System (UMLS) and its corresponding component word stems. The similarity between concepts is defined by their relations in a hypernym hierarchy derived from UMLS. After defining the similarity between two phrases by their stem overlaps and the similarity between the concepts they represent, we define the similarity between two documents as the cosine of the angle between their corresponding phrase vectors. This paper reports the development and validation of the phrase-based VSM.

Measurement: We compare the retrieval effectiveness of different vector space models using two standard test collections, OHSUMED and Medlars. OHSUMED contains 105 queries and 14,430 documents, and Medlars contains 30 queries and 1033 documents. Each document in the test collections is judged by human experts to be either relevant or non-relevant to each query. Retrieval effectiveness is measured by precision and recall.

Results: The phrase-based VSM is significantly more effective than the current gold standard, the stem-based VSM. These significant improvements in retrieval effectiveness are observed in both exhaustive-search and cluster-based document retrieval.

Conclusion: The phrase-based VSM is a better indexing scheme than the stem-based VSM. Medical document retrieval using the phrase-based VSM is significantly more effective than retrieval using the stem-based VSM.
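
The two quantities the abstract relies on, cosine similarity between document vectors and precision/recall against expert relevance judgments, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it computes the ordinary cosine over sparse term-weight vectors and does not model the phrase-to-phrase similarity (stem overlap plus UMLS concept similarity) that the paper defines. The function names `cosine_similarity` and `precision_recall`, the dictionary representation, and the sample weights are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors.

    Each vector is a dict mapping an indexing term (hypothetically a stem
    or a phrase identifier) to a non-negative weight.
    """
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query and document vectors with made-up weights.
    query = {"heart": 1.0, "attack": 1.0}
    document = {"heart": 2.0, "failure": 1.0}
    print(cosine_similarity(query, document))

    # Hypothetical ranked result list and expert judgments.
    print(precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"]))
```

In this sketch, precision is the fraction of retrieved documents judged relevant and recall is the fraction of relevant documents that were retrieved, which matches the standard definitions used with test collections such as OHSUMED and Medlars.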

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 61, Issue 1, April 2007, Pages 76–92
Authors
, ,