Article ID | Journal | Published Year | Pages | File Type
396863 | Information Systems | 2014 | 17 Pages | PDF
Abstract

• This work is among the first efforts to study the data complexity of relative information completeness.
• Three problems fundamental to relative information completeness are identified and studied.
• We provide the upper and lower bounds of these problems, all matching, for data complexity.

Databases in an enterprise are often partially closed: parts of their data must be contained in master data, which has complete information about the core business entities of the enterprise. With this comes the need for studying relative information completeness: a partially closed database is said to be complete for a query relative to master data if it has complete information to answer the query, i.e., extending the database by adding more tuples either does not change its answer to the query or makes it no longer partially closed w.r.t. the master data. This paper investigates three problems associated with relative information completeness. Given a query Q and a partially closed database D w.r.t. master data Dm, (1) the relative completeness problem is to decide whether D is complete for Q relative to Dm; (2) the minimal completeness problem is to determine whether D is a minimal database that is complete for Q relative to Dm; and (3) the bounded extension problem is to decide whether it suffices to extend D by adding at most K tuples, such that the extension makes a partially closed database that is complete for Q relative to Dm. While the combined complexity bounds of the relative completeness problem and the minimal completeness problem are already known, neither their data complexity nor the bounded extension problem has been studied. We establish upper and lower bounds of these problems for data complexity, all matching, for Q expressed in a variety of query languages.
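
The definitions above can be made concrete with a small brute-force sketch. It is only an illustration under strong simplifying assumptions that are not taken from the paper: the database is a single relation of tuples, the containment constraint is simply "the first column of every tuple must appear in the master data", and new tuples are drawn from a small finite `universe` (in general the set of possible tuples is not bounded this way). All function names are hypothetical, and the exhaustive enumeration says nothing about the complexity bounds established in the paper.

```python
from itertools import combinations, product


def is_partially_closed(db, master, key_col=0):
    """Assumed containment constraint: every key value in db occurs in master."""
    return all(t[key_col] in master for t in db)


def extensions(db, universe, max_added=None):
    """Yield db plus every non-empty set of new tuples taken from universe."""
    new = sorted(set(universe) - set(db))
    limit = len(new) if max_added is None else min(max_added, len(new))
    for k in range(1, limit + 1):
        for delta in combinations(new, k):
            yield frozenset(db) | frozenset(delta)


def is_relatively_complete(db, query, master, universe):
    """db is complete for query relative to master if every extension either
    leaves the answer unchanged or is no longer partially closed."""
    if not is_partially_closed(db, master):
        return False
    base = query(db)
    return all(
        query(ext) == base
        for ext in extensions(db, universe)
        if is_partially_closed(ext, master)
    )


def has_bounded_extension(db, query, master, universe, k):
    """Can db be made complete by adding at most k tuples (possibly none)?"""
    if is_relatively_complete(db, query, master, universe):
        return True
    return any(
        is_relatively_complete(ext, query, master, universe)
        for ext in extensions(db, universe, max_added=k)
    )


# Toy data: tuples are (customer, item); master data lists the known customers.
master = {"c1", "c2"}
universe = set(product(["c1", "c2", "c3"], ["a", "b"]))


def customers(db):
    """Query: which customers appear in the database?"""
    return {t[0] for t in db}


full = {("c1", "a"), ("c2", "a")}
partial = {("c1", "a")}

print(is_relatively_complete(full, customers, master, universe))
print(is_relatively_complete(partial, customers, master, universe))
print(has_bounded_extension(partial, customers, master, universe, 1))
```

In this toy setting `full` is complete for `customers`: any added tuple either mentions `c1` or `c2` (the answer stays the same) or mentions `c3` (the database is no longer partially closed). `partial` is not complete, because adding a `c2` tuple changes the answer, but a single added tuple is enough to make it complete.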
