Article ID: 378783
Journal: Data & Knowledge Engineering
Published Year: 2013
Pages: 14
File Type: PDF
Abstract

Discovering functional dependencies (FDs) from existing databases is important to knowledge discovery, machine learning and data quality assessment. A number of algorithms have been proposed in the literature. In this paper, we review and compare these algorithms to identify their advantages and differences. We then propose a simple but time- and space-efficient hash-based algorithm for FD discovery. We compare the performance of three recently published algorithms with that of our hash-based algorithm. We show that the hash-based algorithm performs best among the four and analyze the reasons.
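
To make the idea of hash-based FD checking concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It only shows the general technique of testing whether a candidate FD X -> A holds by hashing the X-values of each row and checking that each maps to a single A-value. The function names `fd_holds` and `discover_fds`, the row representation (a list of dicts), and the `max_lhs_size` bound are all assumptions made for illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts).

    Hypothetical helper: one pass over the data, using a hash map from
    each LHS value tuple to the first RHS value seen with it.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = row[rhs]
        # setdefault stores val on first sight, else returns the stored value
        if seen.setdefault(key, val) != val:
            return False  # same LHS values, different RHS value
    return True


def discover_fds(rows, attributes, max_lhs_size=2):
    """Naively enumerate minimal FDs whose LHS has at most max_lhs_size attributes.

    This brute-force enumeration is an assumption for illustration only;
    published algorithms prune the candidate space far more aggressively.
    """
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        for size in range(1, max_lhs_size + 1):
            for lhs in combinations(others, size):
                # Skip candidates that contain an already-found LHS (not minimal).
                if any(r == rhs and set(prev) <= set(lhs) for prev, r in found):
                    continue
                if fd_holds(rows, lhs, rhs):
                    found.append((lhs, rhs))
    return found


if __name__ == "__main__":
    rows = [
        {"id": 1, "city": "A", "zip": "10"},
        {"id": 2, "city": "A", "zip": "10"},
        {"id": 3, "city": "B", "zip": "20"},
    ]
    for lhs, rhs in discover_fds(rows, ["id", "city", "zip"]):
        print(", ".join(lhs), "->", rhs)
```

Each call to `fd_holds` costs one pass over the rows and memory proportional to the number of distinct X-values, which is why hashing is a natural building block here. The cost of the overall search is dominated by how many candidates are tested, which this sketch does not try to optimize.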
