Article code: 462101
Journal code: 696668
Publication year: 2010
English article: 8-page PDF
Full-text version: Free download
English title of the ISI article
A weighted common structure based clustering technique for XML documents
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract

XML has recently become very popular as a means of representing semistructured data and as a standard for data exchange over the Web, because of its varied applicability in numerous applications. Therefore, XML documents constitute an important data mining domain. In this paper, we propose a new method of XML document clustering by a global criterion function, considering the weight of common structures. Our approach initially extracts representative structures of frequent patterns from schemaless XML documents using a sequential pattern mining algorithm. Then, we perform clustering of an XML document by the weight of common structures, without a measure of pairwise similarity, assuming that an XML document is a transaction and frequent structures extracted from documents are items of the transaction. We conducted experiments to compare our method with previous methods. The experimental results show the effectiveness of our approach.
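The pipeline outlined in the abstract (extract frequent structures, treat each document as a transaction of those structures, then cluster with a global criterion rather than pairwise similarity) can be illustrated with a toy sketch. This is not the authors' algorithm; everything below is an assumption made for illustration. Plain root-to-leaf tag paths with a document-frequency threshold stand in for the sequential pattern mining step, and the criterion `cluster_score` (each structure's count in a cluster multiplied by its weight, count / cluster size) is a hypothetical stand-in for the paper's weighted criterion function. The names `leaf_paths`, `to_transactions`, `cluster_score`, and `cluster` are likewise invented here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def leaf_paths(xml_text):
    """Return the set of root-to-leaf tag paths of one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        children = list(node)
        if not children:
            paths.add(path)
        for child in children:
            stack.append((child, path + "/" + child.tag))
    return paths


def to_transactions(docs, min_support):
    """Turn each document into a transaction: the set of frequent paths it contains.

    A path is 'frequent' if it occurs in at least min_support documents
    (a simple stand-in for sequential pattern mining).
    """
    all_paths = [leaf_paths(d) for d in docs]
    support = Counter(p for paths in all_paths for p in paths)
    frequent = {p for p, c in support.items() if c >= min_support}
    return [paths & frequent for paths in all_paths]


def cluster_score(counts, size):
    """Hypothetical criterion: sum over structures of count * (count / size)."""
    if size == 0:
        return 0.0
    return sum(c * c for c in counts.values()) / size


def cluster(transactions):
    """Greedily place each transaction where the summed cluster score grows most.

    No pairwise document similarity is computed; ties favor the earliest
    existing cluster over opening a new one.
    """
    clusters = []  # each entry: {"counts": Counter, "size": int}
    labels = []
    for t in transactions:
        gains = []
        for c in clusters:
            merged = c["counts"] + Counter(t)
            before = cluster_score(c["counts"], c["size"])
            gains.append(cluster_score(merged, c["size"] + 1) - before)
        gains.append(cluster_score(Counter(t), 1))  # gain of a new cluster
        best = gains.index(max(gains))
        if best == len(clusters):
            clusters.append({"counts": Counter(), "size": 0})
        clusters[best]["counts"].update(t)
        clusters[best]["size"] += 1
        labels.append(best)
    return labels
```

Calling `cluster(to_transactions(docs, min_support))` on a list of XML strings returns one cluster label per document. Under this particular criterion, adding a document to a cluster whose structures it does not share lowers that cluster's score, so structurally unrelated documents tend to open a new cluster.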

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 83, Issue 7, July 2010, Pages 1267–1274
Authors
, ,