Article ID: 429351
Journal: Journal of Computational Science
Published Year: 2015
Pages: 11
File Type: PDF
Abstract

• We built a system for detecting and trending events from tweet clusters discovered using the locality-sensitive hashing (LSH) technique.
• Construction of feature vectors in a high-dimensional dataset.
• Leveraging LSH-based cluster discovery to find truly interesting events and record their attributes in a MySQL database.
• Trending event behavior over time, geo-location, and cluster size.

Social media data carries abundant hidden occurrences of real-time events. In this paper, a novel methodology is proposed for detecting and trending events from tweet clusters that are discovered using the locality-sensitive hashing (LSH) technique. Key challenges include: (1) constructing a dictionary using incremental term frequency–inverse document frequency (TF–IDF) in high-dimensional data to create tweet feature vectors, (2) leveraging LSH to find truly interesting events, (3) trending the behavior of an event based on time, geo-location, and cluster size, and (4) speeding up the cluster-discovery process while retaining cluster quality. Experiments are conducted for a specific event, and the clusters discovered using LSH and K-means are compared with those produced by the group average agglomerative clustering technique.
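
The abstract does not state which TF–IDF weighting or LSH family the paper uses, so the following is only a minimal sketch of the general idea under stated assumptions: a smoothed IDF that is updated as each tweet arrives, and random-hyperplane (cosine) LSH in which tweets sharing a bit signature fall into the same candidate cluster. The class names `IncrementalTfidf` and `HyperplaneLsh`, the whitespace tokenizer, and the parameters `num_bits` and `seed` are hypothetical and chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict


class IncrementalTfidf:
    """Dictionary of document frequencies that grows as tweets arrive.

    Assumption: whitespace tokenization and a smoothed IDF, log(1 + N / df).
    """

    def __init__(self):
        self.doc_count = 0
        self.doc_freq = Counter()

    def add(self, text):
        """Update the dictionary with one tweet; return its sparse TF-IDF vector."""
        tokens = text.lower().split()
        self.doc_count += 1
        self.doc_freq.update(set(tokens))
        term_counts = Counter(tokens)
        return {
            term: (count / len(tokens))
            * math.log(1 + self.doc_count / self.doc_freq[term])
            for term, count in term_counts.items()
        }


class HyperplaneLsh:
    """Random-hyperplane LSH over sparse vectors (an assumed LSH family)."""

    def __init__(self, num_bits=8, seed=0):
        self.num_bits = num_bits
        self.seed = seed
        self._planes = {}  # term -> one Gaussian component per hyperplane
        self.buckets = defaultdict(list)  # signature -> tweet ids

    def _components(self, term):
        # Components are created lazily so the vocabulary can keep growing.
        if term not in self._planes:
            rng = random.Random(f"{self.seed}:{term}")
            self._planes[term] = [rng.gauss(0.0, 1.0) for _ in range(self.num_bits)]
        return self._planes[term]

    def signature(self, vector):
        """One bit per hyperplane: the sign of the dot product with the vector."""
        sums = [0.0] * self.num_bits
        for term, weight in vector.items():
            for i, component in enumerate(self._components(term)):
                sums[i] += weight * component
        return tuple(1 if s >= 0 else 0 for s in sums)

    def insert(self, tweet_id, vector):
        """Place a tweet in the bucket named by its signature."""
        sig = self.signature(vector)
        self.buckets[sig].append(tweet_id)
        return sig


tfidf = IncrementalTfidf()
lsh = HyperplaneLsh(num_bits=8, seed=1)
for tweet_id, text in [("t1", "earthquake hits city centre"),
                       ("t2", "strong earthquake hits the city"),
                       ("t3", "new phone released today")]:
    lsh.insert(tweet_id, tfidf.add(text))
```

Each bucket in `lsh.buckets` is a candidate cluster found without comparing every pair of tweets, which is where the speed-up over agglomerative clustering would come from. Fewer bits give larger, looser buckets and more bits give smaller, tighter ones, so `num_bits` trades cluster-discovery speed against cluster quality.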
