Article code: 472664
Journal code: 698737
Publication year: 2011
English article: 14 pages, PDF
Full-text version: free download
English title of the ISI article
Automatically extracting user reviews from forum sites
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

User reviews on forum sites are an important information source for many popular applications (e.g., monitoring and analysis of public opinion), which are usually represented in the form of structured records. To the best of our knowledge, little existing work in the literature has systematically investigated the problem of extracting user reviews from forum sites. Besides the variety of web page templates, user-generated reviews raise two new challenges. First, the inconsistency of review contents, in terms of both the document object model (DOM) tree and visual appearance, impairs the similarity between review records; second, the review content in a review record corresponds to complicated subtrees rather than single nodes in the DOM tree. To tackle these challenges, we present WeRE, a system that performs automatic user review extraction. The review records are first extracted from web pages using the proposed level-weighted tree similarity algorithm, and the review contents in those records are then extracted precisely by measuring node consistency. Our experimental results on 20 forum sites indicate that WeRE can achieve high extraction accuracy.
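
The abstract names a level-weighted tree similarity but does not define it, so the following is only an illustrative sketch of how such a measure might behave, not the WeRE algorithm itself. It assumes DOM subtrees are given as nested (tag, children) tuples, compares the tag multisets found at each depth, and weights shallow levels more heavily than deep ones. The function names and the `decay` parameter are hypothetical.

```python
from collections import Counter


def tags_by_level(tree):
    """Return one Counter of tag names per depth of a (tag, children) tree."""
    levels = []
    frontier = [tree]
    while frontier:
        levels.append(Counter(tag for tag, _ in frontier))
        frontier = [child for _, children in frontier for child in children]
    return levels


def level_weighted_similarity(a, b, decay=0.5):
    """Similarity in [0, 1]; depth d contributes with weight decay ** d.

    Assumption for illustration only: per-level similarity is the
    multiset Jaccard overlap of tag names at that depth.
    """
    levels_a, levels_b = tags_by_level(a), tags_by_level(b)
    score = total = 0.0
    for d in range(max(len(levels_a), len(levels_b))):
        ca = levels_a[d] if d < len(levels_a) else Counter()
        cb = levels_b[d] if d < len(levels_b) else Counter()
        shared = sum((ca & cb).values())  # multiset intersection
        union = sum((ca | cb).values())   # multiset union, never empty here
        weight = decay ** d
        score += weight * shared / union
        total += weight
    return score / total


# Two hypothetical review records that differ only deep inside the content,
# and an unrelated block from the same page.
post_a = ("div", [("span", []), ("p", [("b", [])])])
post_b = ("div", [("span", []), ("p", [("i", [])])])
other = ("table", [("tr", [])])

print(level_weighted_similarity(post_a, post_b))
print(level_weighted_similarity(post_a, other))
```

Under these assumptions, records that share their upper structure still score as similar even when the user-written content deep in the subtree differs, which is the kind of inconsistency the abstract describes as the first challenge.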

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Mathematics with Applications - Volume 62, Issue 7, October 2011, Pages 2779–2792