Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
496306 | 862856 | 2012 | 17-page PDF | Free download |

After projecting high-dimensional data onto a two-dimensional map via the self-organizing map (SOM), users can easily view the inner structure of the data on the 2-D map. Inspecting this inner structure is useful for any kind of data in the early stages of data mining. However, few studies apply the SOM to transactional data and the related categorical domain, which are usually accompanied by concept hierarchies. Concept hierarchies contain information about the data, yet such studies largely ignore them, which may cause mapping errors. In this paper, we propose an extended SOM model, the SOMCD, which can map the various kinds of data in the categorical domain onto a 2-D map and visualize their inner structure on the map. Tree structures represent both the different kinds of data objects and the neurons’ prototypes, and a newly devised distance measure, which takes the information embedded in concept hierarchies into account, properly captures the similarity between data objects and neurons. Besides the distance measure, we base the SOMCD on a tree-growing adaptation method and integrate the U-Matrix for visualization. Users can hierarchically separate the trained neurons on the SOMCD's map into different groups and thereby cluster the data objects. In experiments on synthetic and real datasets, the SOMCD performs better than other SOM variants and clustering algorithms in visualization, mapping, and clustering.
Highlights
► The SOMCD extends the application scope of the conventional SOM to include transactional data accompanied by a concept hierarchy.
► A distance function is devised for the categorical domain so that the relevancy information embedded in concept hierarchies can be measured.
► The SOMCD is a complete solution for projecting, visualizing, and clustering data related to the categorical domain.
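
To illustrate why a concept hierarchy matters when measuring distance in the categorical domain, the following is a minimal sketch and not the SOMCD's actual distance function, which this abstract does not define. It assumes a simple path-length distance on a tree; the hierarchy, the item names, and the functions `path_to_root` and `hierarchy_distance` are all hypothetical.

```python
# Hypothetical concept hierarchy (an assumption for illustration only):
# each item maps to its parent concept; the root maps to None.
PARENT = {
    "Any": None,
    "Drink": "Any",
    "Snack": "Any",
    "Coke": "Drink",
    "Pepsi": "Drink",
    "Chips": "Snack",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def hierarchy_distance(a, b):
    """Number of edges on the tree path between a and b.

    This is a generic path-length distance via the lowest common
    ancestor, swapped in here as a stand-in for the paper's measure.
    """
    # Map every ancestor of b (including b) to its number of steps from b.
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b))}
    # Walk up from a; the first shared node is the lowest common ancestor.
    for steps_from_a, n in enumerate(path_to_root(a)):
        if n in steps_from_b:
            return steps_from_a + steps_from_b[n]
    raise ValueError("nodes do not share a root")
```

Under plain 0/1 matching, "Coke" is equally dissimilar to "Pepsi" and to "Chips". With the hypothetical hierarchy above, `hierarchy_distance("Coke", "Pepsi")` is smaller than `hierarchy_distance("Coke", "Chips")`, because the first pair shares the parent concept "Drink".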
Journal: Applied Soft Computing - Volume 12, Issue 10, October 2012, Pages 3141–3157