Article code: 459556
Journal code: 696264
Publication year: 2014
English article: 14 pages, PDF
Full-text version: free download
English title of the ISI article
Twitter data analysis by means of Strong Flipping Generalized Itemsets
Keywords
Network analysis and mining; data mining and knowledge discovery; generalized itemset mining
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Networks and Communications
English abstract


• We analyze Twitter user-generated content enriched with taxonomy hierarchies.
• We analyze changes in itemset correlation at different abstraction levels.
• We extract patterns having many unexpected low level correlation changes.
• We discover patterns useful for expert advanced analysis.

Twitter data has recently been used to perform a wide variety of advanced analyses. Analyzing Twitter data poses new challenges because the data distribution is intrinsically sparse, owing to the large number of messages posted every day using a wide vocabulary. To address this issue, generalized itemsets (sets of items at different abstraction levels) can be effectively mined and used to discover interesting multiple-level correlations among data supplied with taxonomies. Each generalized itemset is characterized by a correlation type (positive, negative, or null) according to the strength of the correlation among its items.

This paper presents a novel data mining approach that supports different targeted analyses, such as topic trend analysis and context-aware service profiling, by analyzing Twitter posts. We aim to discover contrasting situations by means of generalized itemsets. Specifically, we compare itemsets discovered at different abstraction levels and select large subsets of specific (descendant) itemsets that show correlation type changes with respect to their common ancestor. To this end, a novel kind of pattern, the Strong Flipping Generalized Itemset (SFGI), is extracted from Twitter messages and contextual information supplied with taxonomy hierarchies. Each SFGI consists of a frequent generalized itemset X and the set of its descendants showing a correlation type change with respect to X.

Experiments performed on both real and synthetic datasets demonstrate the effectiveness of the proposed approach in discovering interesting and hidden knowledge from Twitter data.
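
The idea of a generalized itemset whose descendants change correlation type can be illustrated with a small Python sketch. It is not the paper's algorithm: the toy taxonomy, the transactions, the choice of the Kulczynski measure, the thresholds `max_neg` and `min_pos`, and all function names are assumptions made for illustration. For simplicity, a descendant is built by replacing every item of X with one of its direct children.

```python
from itertools import product

# Hypothetical toy taxonomy (child -> parent); items without an entry are roots.
TAXONOMY = {
    "Turin": "Italy", "Rome": "Italy",
    "football": "sport", "tennis": "sport",
}

# Hypothetical transactions: one set of low-level items per tweet.
TRANSACTIONS = [
    {"Turin", "football"}, {"Turin", "tennis"},
    {"Rome", "football"}, {"Rome", "tennis"},
]

def ancestors(item):
    """Return the item together with all of its taxonomy ancestors."""
    out = {item}
    while item in TAXONOMY:
        item = TAXONOMY[item]
        out.add(item)
    return out

def extend(transaction):
    """Add the ancestors of every item so generalized items can be matched."""
    return set().union(*(ancestors(i) for i in transaction))

def support(itemset, extended):
    """Fraction of extended transactions that contain every item of itemset."""
    return sum(1 for t in extended if set(itemset) <= t) / len(extended)

def kulc(itemset, extended):
    """Kulczynski measure (assumed here): mean of sup(X) / sup(item)."""
    sup = support(itemset, extended)
    if sup == 0:
        return 0.0
    return sum(sup / support([i], extended) for i in itemset) / len(itemset)

def correlation_type(itemset, extended, max_neg=0.4, min_pos=0.6):
    """Classify the itemset as positive, negative, or null (thresholds assumed)."""
    k = kulc(itemset, extended)
    if k >= min_pos:
        return "positive"
    if k <= max_neg:
        return "negative"
    return "null"

def children(item):
    """Direct children of an item in the taxonomy."""
    return [c for c, p in TAXONOMY.items() if p == item]

def flipping_descendants(itemset, transactions, min_sup=0.1):
    """Return X's correlation type and its frequent direct descendants
    whose correlation type differs from that of X."""
    extended = [extend(t) for t in transactions]
    base = correlation_type(itemset, extended)
    flipped = []
    for combo in product(*(children(i) for i in itemset)):
        if support(combo, extended) < min_sup:
            continue  # infrequent descendants are ignored
        kind = correlation_type(combo, extended)
        if kind != base:
            flipped.append((combo, kind))
    return base, flipped

base, flipped = flipping_descendants(("Italy", "sport"), TRANSACTIONS)
print(base, flipped)
```

On this toy data the ancestor {Italy, sport} is positively correlated, while each of its four city-level descendants falls in the null range, so all of them flip. Deciding when the set of flipping descendants is large enough to count as "strong" would need an additional threshold, which this sketch leaves out.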

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 94, August 2014, Pages 16–29