Article code | Journal code | Publication year | English article | Full-text version
6854227 | 1437408 | 2018 | 11 pages, PDF | Free download
English title of the ISI article
Extracting topic-sensitive content from textual documents: A hybrid topic model approach
Keywords
Topical content, feasibility modeling, topic network, semi-supervision
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
Abstract
When exploring information on a topic, users are often concerned with its different aspects. For instance, product designers look for information on specific topic aspects, such as technical challenges and usability, in online consumer opinions, while potential buyers wish to gauge the general sentiment of public opinion. In this paper, we study a problem called topic-sensitive content extraction (TSCE). TSCE aims to extract, from a single document in a given text collection, the content that is relevant to samples of topic aspects highlighted by users. To tackle TSCE, we propose a new hybrid topic model that integrates different structures in both topic space and context space. It focuses on identifying the content associated with a specified topic aspect in each document. By modeling gradient documents via term profiles for context modeling, and by leveraging local and global differences between probability distributions over words in both topic modeling and context modeling, it better captures the features of various language patterns. Hence, sentence relevance ranking with respect to a specific topic aspect is largely improved. Experimental studies on extracting critical content for specific aspects, including motivation and design solution, from technical patents for design analysis show the merits of the proposed model.
Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 70, April 2018, Pages 81-91