Article code: 396635
Journal code: 670435
Publication year: 2007
English article: 12 pages, PDF
Full-text version: free download
English title of the ISI article
Semantic sampling of existing databases through informative Armstrong databases
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Functional dependencies (FDs) and inclusion dependencies (INDs) convey most of the data semantics in relational databases and are very useful in practice since they generalize keys and foreign keys. Nevertheless, FDs and INDs are often unavailable, obsolete or lost in real-life databases. Several algorithms have been proposed for mining these dependencies, but the output is always in the same format: a simple list of dependencies that is hard for the user to understand. In this paper, we define informative Armstrong databases (IADBs) as small subsets of an existing database that satisfy exactly the same FDs and INDs. They are an extension of the classical notion of Armstrong databases, but are more suitable for understanding dependencies, since their tuples are real-world tuples. The main result of this paper is a bound on the size of an IADB in the case of non-circular INDs. A constructive proof of this result is given, from which an algorithm has been devised. An implementation and experiments against a real-life database were performed; the resulting database contains only 0.6% of the tuples of the initial database. More importantly, such semantic sampling of databases appears to be a key feature for understanding existing databases at the logical level.
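
To make the central idea concrete, the following is a minimal sketch of what "a subset satisfying exactly the same FDs" means for a single relation. It is not the algorithm from the paper: it ignores INDs, enumerates FDs by brute force, and uses a hypothetical toy relation (`emp`, `dept`, `city`) and hypothetical helper names (`fd_holds`, `satisfied_fds`) chosen only for illustration.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows agreeing on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

def satisfied_fds(rows, attrs):
    """Brute-force set of all FDs X -> A (A not in X) that hold in rows."""
    result = set()
    for size in range(0, len(attrs)):
        for lhs in combinations(attrs, size):
            for a in attrs:
                if a not in lhs and fd_holds(rows, lhs, (a,)):
                    result.add((lhs, a))
    return result

# Hypothetical toy relation, not data from the paper.
attrs = ("emp", "dept", "city")
full = [
    {"emp": "e1", "dept": "d1", "city": "Lyon"},
    {"emp": "e2", "dept": "d1", "city": "Lyon"},
    {"emp": "e3", "dept": "d2", "city": "Paris"},
    {"emp": "e4", "dept": "d3", "city": "Paris"},
    {"emp": "e5", "dept": "d3", "city": "Paris"},
]

# A subset made of real tuples that keeps one counterexample pair
# for every FD violated in the full relation.
sample = full[:4]

print(satisfied_fds(sample, attrs) == satisfied_fds(full, attrs))
print(satisfied_fds(full[:3], attrs) == satisfied_fds(full, attrs))
```

Any subset of a relation satisfies at least the FDs of the full relation, so the difficulty lies in keeping enough tuples to violate every FD that the full relation violates. In this toy case the first three tuples are too few, because `city -> dept` would then appear to hold although it fails in the full relation.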

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 32, Issue 3, May 2007, Pages 446–457