Prevalent Data Warehouse Development Approaches

There are two prevalent approaches to the development of Datawarehouse Architectures:

  1. Data Warehouse (DWH) bus architecture (introduced by Ralph Kimball)

According to this approach the DWH is developed in phases. Each phase includes the development of a set of dimensional models which are linked together via conformed dimensions, thus forming a virtual bus architecture. Therefore, according to this approach, at the core of the DWH resides a denormalised dimensional data model, which handles data at the atomic level.The major advantages of this approach are inherited from the use of the dimensional model combined with the conformed dimensions principle. This models simple and symmetric structure is easily understood by Business Analysts (easier than complex normalized data models). Moreover the so called star schema allows the efficient execution of queries (less relational joins). The conformed dimensions principle allows for the gradual development of a Data Warehouse, in which all information is linked efficiently and analytics spanning different business processes or subject areas are feasible. Each star schema involves a fact table linked to a number of dimensions in a star.Three fundamental types of fact tables: transaction, periodic snapshot, and accumulating snapshot have been defined. In order to define a DWH development roadmap, Kimball introduced the concept of the DWH bus matrix. The bible on this approach is: The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, John Wiley & Sons, 2002 Ralph Kimball and Margy Ross

  1. Corporate Information Factory CIF (introduced by Bill Inmon)

According to this approach, the first step involves the design of a comprehensive abstract data model for the Enterprise (a model mapping the way the Enterprise exploits information). Based on this abstract model, the central DWH data model is developed following a normalized design approach (3NF), which handles data at the atomic level.According to the Inmon approach, dimensional models embedding aggregated facts are built by querying this central atomic DWH data model and serve departmental needs (this is one of the major disagreements between the two thought leaders read Kimballs open letter to the data warehousing community).Both development approaches agree to the following points:

  • Phased-iterative implementation is the way to proceed, by prioritizing on business processes or subject areas.
  • The use of a separate staging area in which extraction-transformation-cleansing of source data takes places in order to be loaded to the DWH (known as ETL operations).
  • The power of information resides in the atomic data, which embed all available information dimensionality.

Copyright 2006 Kostis Panayotakis

Author: Kostis Panayotakis
Article Source: EzineArticles.com
Provided by: Canada duty


You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Leave a Reply

You must be logged in to post a comment.