Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
528431 | 869568 | 2015 | 13-page PDF | Free download |
• Fusion functions for multi-valued data structures like sets and bags are investigated.
• Particular attention is given to pointwise functions.
• Mathematical and behavioral properties of these functions are investigated.
• Some particular functions are evaluated with respect to the (combination of) quality measures they optimize.
Assessing and improving data quality is a major challenge for any modern information source. One aspect of data quality that has attracted considerable interest in recent decades is the detection and fusion of duplicate data. This paper contributes to the field of duplicate data fusion by investigating a framework of fusion functions. In particular, it observes that little is known about fusion theory for multisets as a data structure. Therefore, a class of multi-valued functions, called pointwise fusion functions, is proposed and investigated. An extensive list of properties is defined in order to compare the behavior of multi-valued fusion functions. Some specific pointwise fusion functions are investigated with respect to these properties and evaluated in different fusion scenarios. Finally, some quality measures are presented and their usefulness in the different fusion scenarios is discussed.
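The abstract does not give the formal definition of a pointwise fusion function, so the following is only an illustrative sketch under an assumption: that "pointwise" means the multiplicity of each element in the fused multiset is computed solely from that element's multiplicities in the source multisets. The function name `pointwise_fuse`, the `combine` parameter, and the choice of `min`, `max`, and low median as combiners are hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import median_low


def pointwise_fuse(sources, combine):
    """Fuse multisets (Counters) element by element.

    Assumption: the fused multiplicity of an element depends only on
    that element's multiplicities in the sources. `combine` maps a list
    of non-negative integers to one non-negative integer.
    """
    # Every element that occurs in at least one source.
    universe = set().union(*sources)
    fused = Counter()
    for element in universe:
        # A Counter returns 0 for elements it does not contain.
        counts = [source[element] for source in sources]
        multiplicity = combine(counts)
        if multiplicity > 0:
            fused[element] = multiplicity
    return fused


# Three hypothetical duplicate records, each a bag of values.
sources = [
    Counter({"a": 2, "b": 1}),
    Counter({"a": 1, "b": 1, "c": 1}),
    Counter({"a": 3}),
]

fused_min = pointwise_fuse(sources, min)            # multiset intersection
fused_max = pointwise_fuse(sources, max)            # multiset union
fused_median = pointwise_fuse(sources, median_low)  # per-element low median
```

With these inputs, the `min` combiner keeps only what all sources agree on, the `max` combiner keeps everything any source reports, and the median sits between the two. That spread is the kind of behavioral difference the listed properties and quality measures are meant to compare.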
Journal: Information Fusion - Volume 25, September 2015, Pages 121–133