Article ID: 459565
Journal: Journal of Systems and Software
Published Year: 2014
Pages: 10
File Type: PDF
Abstract

• We identify that SCDs can have three types of validity periods.
• We propose SKTDW to efficiently identify data and avoid most time comparisons.
• We compare TDW and SKTDW using query formulation, query performance, and warehouse size.
• SKTDW outperforms TDW for all types of queries.
• The average and maximum performance improvements are 165% and 1071%, respectively.

Analysis of historical data in data warehouses contributes significantly toward future decision-making. A number of design factors, including slowly changing dimensions (SCDs), affect the quality of such analysis. In SCDs, attribute values may change over time and must be tracked. SCD designs should maintain the consistency and correctness of data and deliver good query performance. We identify that SCDs can have three types of validity periods: disjoint, overlapping, and same validity periods. We then show that the third type cannot be handled through the temporal star schema for temporal data warehouses (TDWs). We further show that a hybrid (Type 6) scheme together with the temporal star schema may be used to handle this shortcoming. We demonstrate that the use of a surrogate key in the hybrid scheme efficiently identifies data, avoids most time comparisons, and improves query performance. Finally, we compare TDW and a surrogate key-based temporal data warehouse (SKTDW) using query formulation, query performance, and data warehouse size as parameters. The results of our experiments for 23 queries of five different types show that SKTDW outperforms TDW for all types of queries, with average and maximum performance improvements of 165% and 1071%, respectively. These results are statistically significant.
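
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' schema or implementation. The row layout, the half-open interval convention [valid_from, valid_to), the far-future end date, and all names (classify_periods, DIM, region_by_time, region_by_sk) are assumptions made only for illustration. The sketch shows how two validity periods might be classified as disjoint, overlapping, or same, and how a surrogate key stored with a fact lets a lookup use a single equality match instead of date-range comparisons.

```python
from datetime import date


def classify_periods(a, b):
    """Classify two half-open periods (start, end) as same, overlapping, or disjoint.

    The half-open convention is an assumption of this sketch.
    """
    if a == b:
        return "same"
    if a[0] < b[1] and b[0] < a[1]:
        return "overlapping"
    return "disjoint"


# Hypothetical dimension table: every version of a customer gets its own
# surrogate key (sk) in addition to its validity period.
DIM = [
    {"sk": 1, "customer_id": "C1", "region": "A",
     "valid_from": date(2010, 1, 1), "valid_to": date(2012, 1, 1)},
    {"sk": 2, "customer_id": "C1", "region": "B",
     "valid_from": date(2012, 1, 1), "valid_to": date(9999, 12, 31)},
]

# Index by surrogate key so a fact row can reach its dimension version directly.
DIM_BY_SK = {row["sk"]: row for row in DIM}


def region_by_time(customer_id, sale_date):
    """Temporal-style lookup: match the business key, then compare dates."""
    for row in DIM:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= sale_date < row["valid_to"]):
            return row["region"]
    return None


def region_by_sk(customer_sk):
    """Surrogate-key lookup: one equality match, no date comparison."""
    return DIM_BY_SK[customer_sk]["region"]
```

In this sketch, a fact row that stores the surrogate key 2 resolves to region "B" with one dictionary access, whereas the temporal-style lookup must test the sale date against each version's validity period.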
