Article code: 392968
Journal code: 665210
Publication year: 2016
English article: 18 pages, PDF
Full-text version: Free download
English title of the ISI article
Specification and discovery of web patterns: a graph grammar approach
Persian translation of the title
مشخصات و کشف الگوهای وب: رویکرد گرامر گراف
Keywords
Web patterns, spatial grammar, graph grammar algorithm
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Finding useful information on the Web becomes increasingly difficult as the volume of Web data grows rapidly. To facilitate effective Web browsing, Web designers usually display the same type of information with a consistent layout (referred to as a Web pattern). Discovering Web patterns can benefit many applications, such as the extraction of structured data. This paper presents a generic framework, based on graph grammars, for discovering Web patterns and recognizing their instances (i.e., structured data). In our framework, a Web pattern is specified visually yet formally as a graph grammar, which is induced automatically by a grammar induction engine. The distinguishing feature of this engine is that it converts the problem of (2-dimensional) graph grammar induction into (1-dimensional) string induction. Based on the induced pattern, matching instances are recognized in Web pages through a graph parsing process. We have evaluated the framework on twenty-one e-commerce Web sites. The evaluation results are promising, with a high F1-score.

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 328, 20 January 2016, Pages 528–545