Article code: 552375
Journal code: 1451056
Publication year: 2016
English article: 13-page PDF
English title of the ISI article
A novel trend surveillance system using the information from web search engines
Persian translation of the title
یک سیستم نظارت روند جدید با استفاده از اطلاعات از موتورهای جستجو وب
Keywords
Trend surveillance; Learning to rank; Data mining; Feature selection
Related topics
Engineering and Basic Sciences; Computer Engineering; Information Systems
English abstract


• Proposes an adaptive trend surveillance framework and an effective feature selection algorithm, TF-LTR
• Investigates pair-wise learning-to-rank models to measure a term's discriminative power
• Supports government officials and authorities in constructing effective and efficient trend surveillance systems

Web search engines are becoming a major platform for the general public to access information. It has been suggested that because the search patterns of search engine users are correlated with emerging events, the query logs of search engines have the potential for trend surveillance, such as monitoring outbreaks of epidemics. Many trend surveillance studies have investigated the use of query logs and have strived to identify query terms suitable for trend surveillance. Most of these works select representative query terms by consulting domain experts or by preparing a large text corpus for feature selection. These approaches, however, are too costly to make trend surveillance methods adaptable to different topics. In this paper, we propose an adaptive trend surveillance method. We developed a simple and effective feature selection algorithm, called TF-LTR, which leverages the documents returned by search engines and the frequency of the terms in the returned documents to select representative query terms of trending topics. Specifically, we investigated pair-wise learning-to-rank models in order to measure a term's discriminative power in making a document rank higher in the returned document list. The discriminative power is combined with the term frequency, which denotes the on-topic degree of a term, to measure a term's representativeness for a trending topic. Representative terms and their query frequencies are applied to a state-of-the-art data mining model to enhance the effectiveness of trend surveillance. The experimental results, based on trending topics from different domains, show that our trend surveillance method performs well and that the ranking information of search engines is helpful for trend surveillance. In light of this, the proposed method can provide effective support for government officials and authorities, helping them respond to fast-changing events and topics and make appropriate decisions.
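
The abstract does not spell out how TF-LTR is computed, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes a perceptron-style pair-wise ranker over binary term-presence features stands in for the learning-to-rank model, and that discriminative power and term frequency are combined by a simple product. The function names `pairwise_term_weights` and `tf_ltr_scores`, the clipping of negative weights to zero, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_term_weights(ranked_docs, epochs=20, lr=0.1):
    """Learn one weight per term from document pairs (assumed stand-in ranker).

    ranked_docs is a list of token lists, ordered best-first as a search
    engine would return them. For every pair (hi, lo) with hi ranked above
    lo, the weights are nudged until hi scores higher than lo.
    """
    vocab = sorted({t for doc in ranked_docs for t in doc})
    present = [set(doc) for doc in ranked_docs]
    weights = dict.fromkeys(vocab, 0.0)
    for _ in range(epochs):
        # combinations yields (i, j) with i < j, so i is the higher-ranked doc.
        for hi, lo in combinations(range(len(ranked_docs)), 2):
            diff = {t: (t in present[hi]) - (t in present[lo]) for t in vocab}
            margin = sum(weights[t] * diff[t] for t in vocab)
            if margin <= 0:  # pair is misordered (or tied): perceptron update
                for t in vocab:
                    weights[t] += lr * diff[t]
    return weights


def tf_ltr_scores(ranked_docs):
    """Combine discriminative power with term frequency (assumed: product)."""
    weights = pairwise_term_weights(ranked_docs)
    tf = Counter(t for doc in ranked_docs for t in doc)
    # Terms that push documents down the ranking get a score of zero.
    return {t: max(weights[t], 0.0) * tf[t] for t in weights}


if __name__ == "__main__":
    # Toy result list for a hypothetical "flu" topic, best-ranked first.
    ranked_docs = [
        ["flu", "flu", "fever"],
        ["flu", "news"],
        ["news", "sports"],
    ]
    scores = tf_ltr_scores(ranked_docs)
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term}: {score:.2f}")
```

In this sketch, terms that appear often and mostly in highly ranked documents receive the largest scores, which is the intuition the abstract describes; the selected terms' query frequencies would then feed a separate prediction model, which is not shown here.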

Publisher
Database: Elsevier - ScienceDirect
Journal: Decision Support Systems - Volume 88, August 2016, Pages 85–97