Article code: 379320
Journal code: 659288
Publication year: 2008
English article: 32 pages, PDF
English title of the ISI article
Sequence-based clustering for Web usage mining: A new experimental framework and ANN-enhanced K-means algorithm
Related topics
Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence
English abstract

We develop a general sequence-based clustering method by proposing new sequence representation schemes in association with Markov models. The resulting sequence representations allow for calculation of vector-based distances (dissimilarities) between Web user sessions and thus can be used as inputs of various clustering algorithms. We develop an evaluation framework in which the performances of the algorithms are compared in terms of whether the clusters (groups of Web users who follow the same Markov process) are correctly identified using a replicated clustering approach. A series of experiments is conducted to investigate whether clustering performance is affected by different sequence representations and different distance measures as well as by other factors such as number of actual Web user clusters, number of Web pages, similarity between clusters, minimum session length, number of user sessions, and number of clusters to form. A new, fuzzy ART-enhanced K-means algorithm is also developed and its superior performance is demonstrated.
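
The abstract does not spell out the representation schemes, so the following is only a minimal sketch of the general idea under stated assumptions: each session is encoded as a flattened vector of first-order (Markov) transition frequencies, sessions are compared with Euclidean distance, and a plain K-means loop groups them. The function names, the toy sessions, and the choice of initial centroids are all hypothetical, and the fuzzy ART enhancement described in the paper is not reproduced here.

```python
from math import sqrt


def transition_vector(session, n_pages):
    """Encode one session (a list of page indices) as a flattened
    n_pages x n_pages matrix of relative first-order transition frequencies.
    This encoding is an assumption, not the paper's exact scheme."""
    counts = [0.0] * (n_pages * n_pages)
    for src, dst in zip(session, session[1:]):
        counts[src * n_pages + dst] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(vectors, init_centroids, n_iter=20):
    """Plain K-means with caller-supplied initial centroids (hypothetical helper)."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each session goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: euclidean(v, centroids[k]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy sessions over 3 pages: two users cycle 0 -> 1 -> 2, two cycle 2 -> 1 -> 0.
sessions = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1],
    [2, 1, 0, 2, 1, 0],
    [2, 1, 0, 2, 1],
]
vectors = [transition_vector(s, n_pages=3) for s in sessions]
labels = kmeans(vectors, init_centroids=[vectors[0], vectors[2]])
```

Because the sessions become fixed-length vectors, any vector-based distance or clustering algorithm could be substituted for the Euclidean distance and K-means loop used in this sketch.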

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 65, Issue 3, June 2008, Pages 512–543