Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
528220 | 869540 | 2016 | 18-page PDF | Free download |
• Fusion functions based on order relations are formalized.
• It is pointed out that an appropriate order relation is not always at hand.
• The DOC algorithm, which constructs an appropriate order relation dynamically, is provided.
• Selection strategies are discussed.
• A thorough experimental evaluation shows the benefits of the proposed techniques.
A crucial operation in the maintenance of data quality in relational databases is to remove tuples that mutually describe the same entity (i.e., duplicate tuples) and to replace them with a tuple that minimizes information loss. A function that combines multiple tuples into one is called a fusion function. In this paper, we investigate fusion functions for attributes whose values can be sorted by means of an order relation that reflects a notion of generality. It is shown that providing such an order relation a priori, let alone keeping it up to date, is a costly operation. Therefore, the Dynamical Order Construction (DOC) algorithm is proposed, which constructs an order relation in an automated fashion by inspecting the data that need to be fused. Such order relations can be deployed immediately in a framework of selectional fusion functions, which are fusion functions that adopt the sort-and-select principle. These fusion functions are investigated closely in terms of their selection strategies. An experimental evaluation of our method shows the influence of its parameters and its benefit over using a fixed, predefined taxonomy.
Journal: Information Fusion - Volume 27, January 2016, Pages 1–18
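
The sort-and-select principle mentioned in the abstract can be illustrated with a minimal sketch. This is not the DOC algorithm, and it is not taken from the paper: it assumes a small hand-written taxonomy (the fixed, predefined kind of order relation that the paper argues is costly to maintain), and the names `PARENT`, `ancestors`, and `fuse` are hypothetical. The selection strategy shown, which picks the most specific value when all duplicates lie on one chain and otherwise falls back to their nearest common generalization, is only one plausible choice.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical taxonomy, child -> parent (None marks the root).
# Illustrative data only; not from the paper.
PARENT: Dict[str, Optional[str]] = {
    "food": None,
    "restaurant": "food",
    "italian restaurant": "restaurant",
    "pizzeria": "italian restaurant",
    "bar": "food",
}


def ancestors(value: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the strictly more general values, nearest first."""
    chain: List[str] = []
    current = parent.get(value)
    while current is not None:
        chain.append(current)
        current = parent.get(current)
    return chain


def is_general_or_equal(a: str, b: str, parent: Dict[str, Optional[str]]) -> bool:
    """True if a equals b or a is more general than b."""
    return a == b or a in ancestors(b, parent)


def fuse(values: Sequence[str], parent: Dict[str, Optional[str]]) -> Optional[str]:
    """Sort the duplicate values by specificity, then select one.

    Assumed strategy: return the most specific value if every other
    value generalizes it; otherwise return the nearest common ancestor.
    """
    if not values:
        raise ValueError("nothing to fuse")
    # Sort: most specific (deepest) first; ties broken alphabetically.
    distinct = sorted(set(values), key=lambda v: (-len(ancestors(v, parent)), v))
    candidate = distinct[0]
    # Select: the candidate wins if all values lie on its chain.
    if all(is_general_or_equal(v, candidate, parent) for v in distinct):
        return candidate
    # Conflict: fall back to the nearest value that generalizes all of them.
    for anc in ancestors(candidate, parent):
        if all(is_general_or_equal(anc, v, parent) for v in distinct):
            return anc
    return None  # no common generalization in this taxonomy


if __name__ == "__main__":
    print(fuse(["restaurant", "pizzeria", "italian restaurant"], PARENT))
    print(fuse(["pizzeria", "bar"], PARENT))
```

In this sketch the order relation is supplied up front; the approach described in the abstract would instead derive it from the values being fused.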