Article ID: 725329
Journal: The Journal of China Universities of Posts and Telecommunications
Published Year: 2011
Pages: 7 Pages
File Type: PDF
Abstract

In recent text mining research, there is a trend toward analyzing the burst features of specific entities, such as a word, a meme, or a document, in text streams. Such burst features can be identified efficiently and robustly by Kleinberg's two-state automaton model. However, the model's two parameters, which are set manually, heavily affect its performance. In this paper, the function of the two parameters is examined, and two algorithms are proposed for estimating them. Experiments with public news corpora show that our estimation can maximize the reliability of the detection results and effectively remove noisy burst features.
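To make the role of the two parameters concrete, the following is a minimal sketch of a two-state burst automaton in the inter-arrival-gap form of Kleinberg's model. It is not the estimation procedure proposed in the paper, which the abstract does not describe. The parameter names `s` (the rate ratio between the burst state and the base state) and `gamma` (the weight on the cost of entering the burst state) follow Kleinberg's original formulation and are assumed here to be the two parameters the abstract refers to. The function name `detect_bursts` and the default values are hypothetical.

```python
import math


def detect_bursts(gaps, s=2.0, gamma=1.0):
    """Label each inter-arrival gap with state 0 (base) or 1 (burst).

    Assumed parameters, named after Kleinberg's formulation:
      s     -- burst-state arrival rate is s times the base rate (s > 1)
      gamma -- scales the cost of moving from state 0 up to state 1
    """
    n = len(gaps)
    total = sum(gaps)
    if n == 0 or total <= 0:
        return []

    base_rate = n / total
    rates = (base_rate, s * base_rate)
    up_cost = gamma * math.log(n)  # entering a burst costs this; leaving is free

    def emit_cost(state, gap):
        # Negative log-density of an exponential gap with the state's rate.
        rate = rates[state]
        return rate * gap - math.log(rate)

    # Viterbi search for the cheapest state sequence; the automaton starts in state 0.
    cost = [0.0, math.inf]
    back = []
    for gap in gaps:
        new_cost, pointers = [], []
        for state in (0, 1):
            via_0 = cost[0] + (up_cost if state == 1 else 0.0)
            via_1 = cost[1]
            best_prev = 0 if via_0 <= via_1 else 1
            new_cost.append(min(via_0, via_1) + emit_cost(state, gap))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Trace the optimal path backwards from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Ten short gaps surrounded by long ones: a burst in the middle of the stream.
gaps = [10] * 5 + [1] * 10 + [10] * 5
print(detect_bursts(gaps, s=2.0, gamma=1.0))
print(detect_bursts(gaps, s=2.0, gamma=5.0))
```

In this sketch, raising `gamma` makes the automaton more reluctant to enter the burst state, so the same stream can yield a burst under one setting and none under another. This sensitivity to manually chosen values is the kind of behavior that motivates estimating the parameters from data.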

Related Topics
Physical Sciences and Engineering > Engineering > Electrical and Electronic Engineering