On discovery of functional dependencies from data

Article ID	Journal	Published Year	Pages	File Type
378783	Data & Knowledge Engineering	2013	14 Pages	PDF

Abstract

Discovering functional dependencies (FDs) from existing databases is important to knowledge discovery, machine learning and data quality assessment. A number of algorithms for this task have been proposed in the literature. In this paper, we review and compare these algorithms to identify their advantages and differences. We then propose a simple but time- and space-efficient hash-based algorithm for FD discovery. We compare the performance of three recently published algorithms with that of our hash-based algorithm. We show that the hash-based algorithm performs best among the four algorithms and analyze the reasons.
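To make the task concrete, the following is a minimal Python sketch of what checking and enumerating FDs with a hash table can look like. It is an illustration only and not the algorithm proposed in the paper, whose details the abstract does not give. The function names `fd_holds` and `discover_fds`, the list-of-dicts table representation, and the `max_lhs_size` bound are all assumptions made for this example. FDs with an empty left-hand side (constant columns) are ignored.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts).

    Hypothetical helper: hashes each left-hand-side value combination and
    checks that it is always paired with the same right-hand-side value.
    """
    seen = {}  # LHS value tuple -> RHS value first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # setdefault stores value on first sight, otherwise returns the stored one
        if seen.setdefault(key, value) != value:
            return False  # same LHS values, different RHS value: FD violated
    return True


def discover_fds(rows, attributes, max_lhs_size=2):
    """Naively enumerate minimal FDs with 1..max_lhs_size LHS attributes.

    Candidates are tried from small to large, so any candidate whose LHS
    contains an already-found LHS for the same RHS is non-minimal and skipped.
    """
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        for size in range(1, max_lhs_size + 1):
            for lhs in combinations(others, size):
                if any(r == rhs and set(l) <= set(lhs) for l, r in found):
                    continue
                if fd_holds(rows, lhs, rhs):
                    found.append((lhs, rhs))
    return found
```

Each call to `fd_holds` makes one pass over the rows and uses memory proportional to the number of distinct left-hand-side values. The exhaustive enumeration in `discover_fds` grows exponentially with the number of attributes, which is the cost that dedicated FD discovery algorithms aim to reduce.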

Keywords

Data mining; Functional dependencies; Knowledge discovery