Article ID | Journal ID | Publication year | English article | Full text |
---|---|---|---|---|
486463 | 703373 | 2013 | 8-page PDF | Free download |

This paper proposes a generalized approach for clustering a given set of documents, text files, or software components for reuse. The approach is based on a new similarity function, called the hybrid XOR function, which is defined to measure the degree of similarity between any two document sets or any two software components. For n document sets or software components, we construct a similarity matrix of order (n-1) by n by applying the hybrid XOR function to each pair. We then define and design a clustering algorithm that takes the similarity matrix as input and outputs a set of clusters formed dynamically, in contrast to clustering algorithms that fix the number of clusters in advance and then fit each document into one of those clusters or classes. The approach relies only on simple computations.
Journal: Procedia Computer Science - Volume 17, 2013, Pages 121-128
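
The pipeline outlined in the abstract (pairwise similarity, an (n-1) by n similarity matrix, then clustering without a preset cluster count) can be illustrated with a rough sketch. The abstract does not define the hybrid XOR function or the clustering rule, so everything below is an assumption made for illustration: documents are represented as equal-length binary term-presence vectors, `xor_similarity` is a plain bitwise-XOR agreement ratio standing in for the hybrid XOR function, and `cluster` groups documents by a hypothetical similarity `threshold` rather than by the paper's own algorithm.

```python
from typing import List, Optional


def xor_similarity(a: List[int], b: List[int]) -> float:
    """Stand-in similarity (NOT the paper's hybrid XOR function):
    fraction of positions where two binary vectors agree."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    differing = sum(x ^ y for x, y in zip(a, b))  # XOR is 1 where bits differ
    return 1.0 - differing / len(a)


def similarity_matrix(docs: List[List[int]]) -> List[List[Optional[float]]]:
    """Build an (n-1) by n matrix; entry [i][j] is filled only for j > i."""
    n = len(docs)
    if n < 2:
        raise ValueError("need at least two documents")
    matrix: List[List[Optional[float]]] = [[None] * n for _ in range(n - 1)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            matrix[i][j] = xor_similarity(docs[i], docs[j])
    return matrix


def cluster(matrix: List[List[Optional[float]]], threshold: float) -> List[List[int]]:
    """Assumed clustering rule: merge documents whose pairwise similarity
    reaches the threshold. The number of clusters is not fixed in advance."""
    n = len(matrix) + 1
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n - 1):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                parent[find(j)] = find(i)

    groups: dict = {}
    for d in range(n):
        groups.setdefault(find(d), []).append(d)
    return sorted(groups.values())


# Four toy documents over a four-term vocabulary (illustrative data only).
docs = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1], [0, 0, 1, 0]]
print(cluster(similarity_matrix(docs), threshold=0.7))  # [[0, 1], [2, 3]]
```

In this sketch the number of clusters emerges from the data and the chosen threshold: raising the threshold splits documents into more clusters, and lowering it merges them.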