Article ID Journal Published Year Pages File Type
491002 Procedia Technology 2012 6 Pages PDF
Abstract

Data Warehouses store integrated and consistent data in a subject-oriented data repository dedicated especially to support business intelligence processes. Nevertheless, in order to maintain a data warehouse up-to-date, data intensive tasks retrieve regularly specialized information from specific preselected information sources, transforming and conforming it accordingly to some specific business requirements provided by decision-makers. Such tasks, commonly named as Extract-Transform-Load (ETL) processes, have a limited time frame window to be executed over an ever increasing amount of data with extremely complex operations. The common approach to deal with the need of more computational power is the acquisition of new and more powerful hardware. This expensive approach disregards the unused computational resources available in desktop computers already present at most enterprises’ computational environments. This paper intends to define a different approach to deal with ETL processes, taking advantage of parallel processing over a GRID environment using XML data as an effective support to data storage and communication, demonstrating that GRID environments could be a real alternative for the implementation of low cost data warehouses.

Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)