Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
379320 | 659288 | 2008 | 32-page PDF | Free download |
![First-page image of the article: Sequence-based clustering for Web usage mining: A new experimental framework and ANN-enhanced K-means algorithm](/preview/png/379320.png)
We develop a general sequence-based clustering method by proposing new sequence representation schemes in association with Markov models. The resulting sequence representations allow for calculation of vector-based distances (dissimilarities) between Web user sessions and thus can be used as inputs of various clustering algorithms. We develop an evaluation framework in which the performances of the algorithms are compared in terms of whether the clusters (groups of Web users who follow the same Markov process) are correctly identified using a replicated clustering approach. A series of experiments is conducted to investigate whether clustering performance is affected by different sequence representations and different distance measures as well as by other factors such as number of actual Web user clusters, number of Web pages, similarity between clusters, minimum session length, number of user sessions, and number of clusters to form. A new, fuzzy ART-enhanced K-means algorithm is also developed and its superior performance is demonstrated.
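The general idea in the abstract (turn each session into a vector derived from a Markov model, then cluster the vectors by distance) can be illustrated with a minimal sketch. The abstract does not spell out the paper's representation schemes, so everything here is an assumption made for illustration: each session is encoded as its flattened, row-normalised first-order transition matrix, distance is Euclidean, and clustering is plain K-means with a deterministic farthest-first initialisation. The function names `transition_vector`, `euclidean`, and `kmeans` are hypothetical, and the fuzzy ART enhancement is not reproduced.

```python
import math


def transition_vector(session, n_pages):
    """Flatten a session's row-normalised first-order transition matrix.

    Assumption: pages are integer ids in range(n_pages). This encoding is
    one plausible Markov-based representation, not the paper's own scheme.
    """
    counts = [[0.0] * n_pages for _ in range(n_pages)]
    for src, dst in zip(session, session[1:]):
        counts[src][dst] += 1.0
    vec = []
    for row in counts:
        total = sum(row)
        # Rows for pages never left in this session stay all-zero.
        vec.extend(c / total if total else 0.0 for c in row)
    return vec


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, n_iter=50):
    """Plain K-means; returns one cluster label per input vector."""
    # Farthest-first initialisation (an assumption): deterministic, and it
    # avoids starting with duplicate centroids.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(euclidean(v, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: two sessions alternating between pages 0 and 1,
# and two sessions that stay on page 2.
sessions = [[0, 1, 0, 1, 0], [0, 1, 0, 1], [2, 2, 2, 2], [2, 2, 2]]
vectors = [transition_vector(s, n_pages=3) for s in sessions]
labels = kmeans(vectors, k=2)
print(labels)
```

On this toy input the first two sessions share one label and the last two share the other, since each pair produces identical transition vectors. Swapping in a different representation or distance measure only requires replacing `transition_vector` or `euclidean`, which mirrors the kind of comparison the experiments describe.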
Journal: Data & Knowledge Engineering - Volume 65, Issue 3, June 2008, Pages 512–543