کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
379209 659274 2009 19 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An unsupervised method for joint information extraction and feature mining across different Web sites
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
An unsupervised method for joint information extraction and feature mining across different Web sites
چکیده انگلیسی

We develop an unsupervised learning framework which can jointly extract information and conduct feature mining from a set of Web pages across different sites. One characteristic of our model is that it allows tight interactions between the tasks of information extraction and feature mining. Decisions for both tasks can be made in a coherent manner leading to solutions which satisfy both tasks and eliminate potential conflicts at the same time. Our approach is based on an undirected graphical model which can model the interdependence between the text fragments within the same Web page, as well as text fragments in different Web pages. Web pages across different sites are considered simultaneously and hence information from different sources can be effectively leveraged. An approximate learning algorithm is developed to conduct inference over the graphical model to tackle the information extraction and feature mining tasks. We demonstrate the efficacy of our framework by applying it to two applications, namely, important product feature mining from vendor sites, and hot item feature mining from auction sites. Extensive experiments on real-world data have been conducted to demonstrate the effectiveness of our framework.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Data & Knowledge Engineering - Volume 68, Issue 1, January 2009, Pages 107–125
نویسندگان
, ,