Scheduling Data Updates

Creating and Using Data Warehouses

Creating and Using Data Warehouses

Scheduling Data Updates

Data warehouses used for the analysis of historical business data are better served by periodic updates than they are by online data updates. Constantly changing data can interfere with the analysis process, which requires time and iterative querying to refine and verify results. The frequency of data warehouse updates depends on the needs of the organization and the uses of the data warehouse. Typical update periods may be monthly, weekly, or daily. There are several items to consider when scheduling data warehouse updates.

Although the amount of data involved in a data warehouse update is usually much less than the amount of data initially loaded, the process of extracting, cleansing and transforming, and loading the data can still adversely affect the operational systems involved. When you load data into the data warehouse, you are populating the tables that will be used by the presentation applications that make the data available to users. Loading data often involves the transfer of large amounts of data from source operational systems, a data preparation area database, or preparation area tables in the data warehouse database. Such operations can impose significant processing loads on the databases involved and should be accomplished during periods of relatively low system use.

You must also coordinate data warehouse update operations with the operation of the presentation services that make the data available to users. If you are loading large amounts of data, you may want to perform bulk loads without using transactions or logging. In this case, you may need to take the data warehouse offline during the loading operation to prevent users from accessing data that could be inconsistent while it is being loaded to various tables.

Presentation applications that provide access to the data warehouse by end users may require adjustment to accommodate new data. For example, Microsoft® SQL Server™ 2000 Analysis Services cubes often require updating or reprocessing to incorporate new data in the data warehouse. For more information, see Synchronizing OLAP Cubes.