Presentation Services

Creating and Using Data Warehouses

Creating and Using Data Warehouses

Presentation Services

The purpose of a data warehouse is to expose business information for use by decision makers in the organization. A data warehouse containing hundreds of millions of pieces of data is of no use to a decision maker without tools to assist in the analysis and evaluation process. These analysis tools can vary from simple reports to sophisticated data mining algorithms. Application programming interfaces (APIs) must also be available to support the development of custom applications that use data warehouse information.

Predefined Reports

Simple predefined summary reports can provide managers with periodic or on-demand snapshots of the state of the business at a point in time. More sophisticated reports can display trends of predetermined business variables. Such reports are useful and have historically been produced from online transaction (OLTP) systems. To capture up-to-the-minute status, snapshot detail and summary reports must continue to be produced from the data source systems. Periodic reports that are coordinated with data warehouse updates can be shifted to the data warehouse to reduce loads on operational systems. Reports that use historical data to evaluate trends should be accomplished in the data warehouse, which contains readily available historical data in appropriate formats, and which is designed to process large quantities of data for summarization.

You can prepare predefined reports in your Microsoft® SQL Server™ 2000 data warehouse by developing client applications that access either the relational database or multidimensional data cubes prepared by SQL Server 2000 Analysis Services. For more information, see Building SQL Server Applications Overview.

Online Analytical Processing

Predefined reports serve their specific purposes well but are not suited to explorative analysis. Analysts want to discover trends and anomalies in the data and explore various areas of the data to find the sources of these trends and anomalies. Online analytical processing (OLAP) is a tool designed to facilitate this kind of analysis of massive amounts of data warehouse data.

This example illustrates explorative analysis and OLAP. A manager notices a larger than normal sales amount in a predefined summary report and wants to find the cause. The cause may be external to the business, such as an unusual weather condition that drove the sales of related merchandise, or internal, such as a sales promotion. The increased sales may be spread across a geographic region or isolated to a single store, and they may have occurred over a short or relatively long period of time.

To explore data in the data warehouse, the manager asks questions of the data and then asks related or different questions based on the answers. The manager may first ask for the sales data summarized by week for the quarter in question, then drill down to the days of the week to find that the unusual sales amount is a one-time event. Having discovered the time of the event, the manager then explores the sales by product and finds that a large sale of a specific product was made on that day. Based on these answers, the manager then asks for the sales by region and drills down to find that a particular store filled a large order of a specific product on a specific day.

OLAP technology has been developed to facilitate this kind of explorative analysis. Analysis Services includes an Analysis server that uses OLAP technology to prepare large quantities of data warehouse data for exploration in real time. Multidimensional data structures called cubes are predefined and created to organize and summarize data warehouse data in such a way that typical explorative analysis questions can be answered with little or no querying of the relational database. In a typical Analysis Services implementation in a data warehouse, the manager in the example could find the answer in less than a minute because the Analysis server can answer each of the example questions in a second or two, even if there were millions of items in the data being explored. For more information, see Analysis Services Overview.

Data Mining

In contrast to OLAP, which organizes data into predefined multidimensional structures to facilitate exploration, data mining performs the explorative analysis and identifies interesting nuggets of information such as groupings of data for the analyst or manager to examine. Data mining can also create decision trees that can be used to predict future data based on attributes of existing data elements.

Analysis Services incorporates sophisticated data mining algorithms that can be used to analyze data in the relational database or in OLAP cubes. The results of the data mining analysis can also be used with OLAP cubes to enhance explorative analysis. For example, you can let data mining find groupings of customers according to their attributes and then use these groupings to create an additional dimensional view of OLAP data cubes and explore the data from the perspective of these groupings. For more information, see Analysis Services Overview.

Application Programming Interfaces

SQL Server 2000 provides a number of APIs that can be used to develop client applications tailored to your data warehouse needs. Some APIs provide access to database and tool object models so that custom administrative applications can be developed. Other APIs provide access to data through standard interfaces such as OLE DB for use by custom end-user applications. For more information, see Building SQL Server Applications Overview.