Using Dimensional Modeling

Creating and Using Data Warehouses

Creating and Using Data Warehouses

Using Dimensional Modeling

Entity-relation modeling is often used to create a single complex model of all of the organization's processes. This approach has proven effective in creating efficient online transaction processing (OLTP) systems. In contrast, dimensional modeling creates individual models to address discrete business processes. For example, sales information may go to one model, inventory to another, and customer accounts to yet another. Each model captures facts in a fact table and attributes of those facts in dimension tables linked to the fact table. The schemas produced by these arrangements are called star or snowflake schemas, and have been proven effective in data warehouse design.

Dimensional modeling organizes information into structures that often correspond to the way analysts want to query data warehouse data. For example, the question, "What were the sales of food items in the northwest region in the third quarter of 1999?" represents the use of three dimensions (product, geography, time) to specify the information to be summarized.

A Data Warehouse Model

A simple dimensional model of sales information might include a fact table named Sales_Fact that contains one record for each line item of each sale, capturing the quantity sold, the unit cost, and the sale value. Varieties of information about a sales record might include the customer, the store where the sale occurred, the time and date of the sale, and the product sold. Each of these categories of information is organized into its own dimension table. Customer information is placed in a Customer dimension table, store information in a Store dimension table, time and date information in a Time dimension table, and product information in a Product dimension table.

In a star schema, each dimension table has a single-part primary key that links to one part of the multipart primary key in the fact table. In a snowflake schema, one or more dimension tables are decomposed into multiple tables with the subordinate dimension tables joined to a primary dimension table instead of to the fact table. In most designs, star schemas are preferable to snowflake schemas because they involve fewer joins for information retrieval and are easier to manage.