Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
378783 | Data & Knowledge Engineering | 2013 | 14 Pages | |
Abstract
Discovering functional dependencies (FDs) from existing databases is important for knowledge discovery, machine learning, and data quality assessment. A number of algorithms have been proposed in the literature. In this paper, we review and compare these algorithms to identify their advantages and differences. We then propose a simple hash-based algorithm for FD discovery that is efficient in both time and space. We compare the performance of three recently published algorithms with that of our hash-based algorithm. We show that the hash-based algorithm performs best among the four algorithms and analyze the reasons.
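To make the idea of hash-based FD checking concrete, the following is a minimal illustrative sketch. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes a relation stored as a list of dictionaries and tests whether X -> Y holds by hashing each tuple's X-values and checking that every X-value maps to a single Y-value. The names `fd_holds`, `discover_fds`, `SAMPLE_ROWS`, and the `max_lhs_size` parameter are hypothetical and introduced only for this example.

```python
from itertools import combinations

# Hypothetical toy relation used only for illustration.
SAMPLE_ROWS = [
    {"emp": 1, "dept": "A", "mgr": "Ann"},
    {"emp": 2, "dept": "A", "mgr": "Ann"},
    {"emp": 3, "dept": "B", "mgr": "Bob"},
]


def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows.

    A dict (hash table) maps each distinct lhs value to the first rhs
    value seen with it; a later, different rhs value violates the FD.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def discover_fds(rows, attributes, max_lhs_size=2):
    """Naively enumerate minimal FDs with a single right-hand attribute.

    Candidate left-hand sides are tried in order of increasing size, and
    supersets of an already valid left-hand side are skipped. Empty
    left-hand sides (constant columns) are not considered in this sketch.
    """
    found = []
    for target in attributes:
        others = [a for a in attributes if a != target]
        minimal = []
        for size in range(1, max_lhs_size + 1):
            for lhs in combinations(others, size):
                if any(set(m) <= set(lhs) for m in minimal):
                    continue  # a subset already determines target
                if fd_holds(rows, lhs, (target,)):
                    minimal.append(lhs)
                    found.append((lhs, target))
    return found
```

On the toy relation, `fd_holds(SAMPLE_ROWS, ("dept",), ("mgr",))` is true, while `("dept",) -> ("emp",)` fails because department A appears with two different employees. Each check is a single pass over the data with one hash lookup per tuple; the exhaustive enumeration in `discover_fds` is the expensive part, and a practical discovery algorithm would need to prune that candidate space far more aggressively.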
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Jixue Liu, Feiyue Ye, Jiuyong Li, Junhu Wang