کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
517674 867490 2009 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improving accuracy for identifying related PubMed queries by an integrated approach
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
Improving accuracy for identifying related PubMed queries by an integrated approach
چکیده انگلیسی

PubMed is the most widely used tool for searching biomedical literature online. As with many other online search tools, a user often types a series of multiple related queries before retrieving satisfactory results to fulfill a single information need. Meanwhile, it is also a common phenomenon to see a user type queries on unrelated topics in a single session. In order to study PubMed users’ search strategies, it is necessary to be able to automatically separate unrelated queries and group together related queries. Here, we report a novel approach combining both lexical and contextual analyses for segmenting PubMed query sessions and identifying related queries and compare its performance with the previous approach based solely on concept mapping.We experimented with our integrated approach on sample data consisting of 1539 pairs of consecutive user queries in 351 user sessions. The prediction results of 1396 pairs agreed with the gold-standard annotations, achieving an overall accuracy of 90.7%. This demonstrates that our approach is significantly better than the previously published method. By applying this approach to a one day query log of PubMed, we found that a significant proportion of information needs involved more than one PubMed query, and that most of the consecutive queries for the same information need are lexically related. Finally, the proposed PubMed distance is shown to be an accurate and meaningful measure for determining the contextual similarity between biological terms. The integrated approach can play a critical role in handling real-world PubMed query log data as is demonstrated in our experiments.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Biomedical Informatics - Volume 42, Issue 5, October 2009, Pages 831–838
نویسندگان
, ,