Mining key information of web pages: A method and its application

Article code	Journal code	Publication year	English article	Full-text version
387282	660898	2007	9-page PDF	Free download

Keywords

Taxonomy; Entropy; Web content mining; Web page; Ontology generation

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence

Abstract

Web content mining aims to discover useful information and generate desired knowledge from a large number of web pages. Key information embedded in web pages, such as distinctive menu items and navigation indicators, can help classify the main contents of those pages and reflects certain taxonomy knowledge. Mining key information is therefore significant for acquiring domain knowledge and building catalogue classifiers. Current web content mining methods cannot mine such key information effectively, and "noise information" (such as advertisements) degrades the performance of web mining tasks. This paper proposes a method to extract key information from web pages that contain noisy information. The method has two steps: extract a list of candidate key information, then apply an entropy measure to filter out noisy information and discover key information. Experimental results show that the method is effective in discovering key information. Using the discovered key information, which reflects taxonomy knowledge, an application is developed to support ontology generation.
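The two-step idea in the abstract (collect candidates, then filter them with an entropy measure) can be illustrated with a minimal Python sketch. The abstract does not say how candidates are extracted, how the entropy is computed, or which entropy values indicate noise, so everything below is an assumption made for illustration and not the authors' method: candidates are taken to be link texts, the entropy is the normalized Shannon entropy of each candidate's occurrence counts across pages, and a candidate spread evenly over the pages is kept while one concentrated on a few pages is dropped. The names `extract_candidates`, `normalized_entropy`, `filter_by_entropy`, and the `threshold` value are hypothetical.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collects the visible text of every <a> element as a candidate item."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than 0 while inside an <a> element
        self._buffer = []      # text fragments of the current <a>
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.candidates.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_candidates(html):
    """Step 1 (assumed): treat link texts as candidate key information."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.candidates


def normalized_entropy(counts):
    """Shannon entropy of a count distribution, scaled to the range [0, 1]."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(counts))


def filter_by_entropy(pages, threshold=0.8):
    """Step 2 (assumed): keep candidates spread evenly across the pages.

    The direction of the test (high entropy = keep) and the threshold are
    illustrative assumptions; the source does not specify them.
    """
    per_page = [Counter(extract_candidates(html)) for html in pages]
    all_items = set().union(*per_page)
    scores = {
        item: normalized_entropy([counts[item] for counts in per_page])
        for item in all_items
    }
    return sorted(item for item, score in scores.items() if score >= threshold)
```

For three pages that all link to "Home" and "Products", with a "Buy now!" link on only one of them, `filter_by_entropy` keeps "Home" and "Products" and drops "Buy now!". A real pipeline would need a richer candidate extractor and a threshold tuned on labelled pages.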

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 33, Issue 2, August 2007, Pages 425–433

Authors

Chao Wang, Jie Lu, Guangquan Zhang
