Article code: 10321297
Journal code: 659319
Publication year: 2005
English article: 36-page PDF
Full-text version: free download
English title of the ISI article
HW-STALKER: A machine learning-based system for transforming QURE-Pagelets to XML
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
In this paper, we address the problem of extracting dynamically generated, hyperlinked hidden web query results and transforming them to XML. Our approach is based on STALKER. Because STALKER was designed to extract data from a single web page, it cannot handle a set of hyperlinked pages. We propose an algorithm called HW-Transform that uses machine learning to transform hidden web query results (also called QURE-Pagelets) to XML, extending STALKER to handle hyperlinked hidden web pages. One of the key features of our approach is that we identify key attributes of query results and transform them into XML attributes. These key attributes facilitate applications such as change detection and data integration by efficiently identifying related or identical results. Based on the proposed algorithm, we have implemented a prototype system called HW-STALKER in Java. Our experiments demonstrate that HW-Transform shows acceptable performance for transforming QURE-Pagelets to XML.
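The idea of emitting key attributes as XML attributes can be illustrated with a minimal sketch. It is not the HW-Transform algorithm and does no learning or extraction: it assumes a record has already been extracted into field/value pairs and that the set of key fields is given. The class name `KeyAttributeSketch`, the element name `result`, and the sample field names are all hypothetical, chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class KeyAttributeSketch {

    // Builds one <result> element from an already-extracted record.
    // Fields listed in keyFields become XML attributes; all other fields
    // become child elements. Field names are assumed to be valid XML names.
    public static Element toXml(Document doc, Map<String, String> record, Set<String> keyFields) {
        Element result = doc.createElement("result"); // hypothetical element name
        for (Map.Entry<String, String> field : record.entrySet()) {
            if (keyFields.contains(field.getKey())) {
                result.setAttribute(field.getKey(), field.getValue());
            } else {
                Element child = doc.createElement(field.getKey());
                child.setTextContent(field.getValue());
                result.appendChild(child);
            }
        }
        return result;
    }

    // Creates an empty DOM document using the JDK's built-in XML parser factory.
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No XML parser available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample record; "id" plays the role of the key attribute.
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "A-17");
        record.put("title", "Sample record");
        record.put("price", "12.50");

        Element result = toXml(newDocument(), record, Set.of("id"));
        System.out.println("key attribute id = " + result.getAttribute("id"));
        System.out.println("child elements   = " + result.getChildNodes().getLength());
    }
}
```

With the key stored as an attribute, a consumer such as a change detector could match two versions of the same result by comparing `id` values before looking at the child elements, which is the kind of use the abstract describes.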
Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 54, Issue 2, August 2005, Pages 241-276