Article ID Journal Published Year Pages File Type
395686 Information Sciences 2008 12 Pages PDF
Abstract

The need for measuring the dispersion of nominal categorical attributes appears in several applications, like clustering or data anonymization. For a nominal attribute whose categories can be hierarchically classified, a measure of the variance of a sample drawn from that attribute is proposed which takes the attribute’s hierarchy into account. The new measure is the reciprocal of “consanguinity”: the less related the nominal categories in the sample, the higher the measured variance. For non-hierarchical nominal attributes, the proposed measure yields results consistent with previous diversity indicators. Applications of the new nominal variance measure to economic diversity measurement and data anonymization are also discussed.

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, ,