Optimizing Cube Schemas

Analysis Services

Analysis Services

Optimizing Cube Schemas

In many situations Microsoft® SQL Server™ 2000 Analysis Services can optimize a cube's schema to significantly reduce cube processing time by eliminating joins between dimension tables and fact tables.

During dimension processing, the Analysis server creates an internal representation of the dimension data and hierarchy. When processing a cube, the dimension member keys identified in the member key column property are used to access the information in the internal representation of the processed dimension. Under certain conditions, the dimension member's foreign key in the fact table can be used for this lookup, thereby eliminating the need to join the dimension table to the fact table in the database query. This significantly reduces the complexity of the query, the amount of data accessed in the relational database, and network traffic between the Analysis server and the relational database.

To take advantage of cube schema optimization, when you design a cube in Cube Editor, click the Optimize Schema command on the Tools menu. Analysis Services then modifies the schema to eliminate joins between the fact table and dimension tables, where possible. Certain conditions must be met for Analysis Services to eliminate a join between a dimension and the fact table. These are:

  • The dimension must be a shared dimension, and must have been processed before you optimize the cube schema.

  • The member key column for the lowest level of the dimension must contain the keys that relate the fact table and the dimension table. This must be the only key necessary to relate the fact table to the dimension table.

  • The keys in the member key column for the lowest level of the dimension must be unique.

  • The lowest level of the dimension must be represented in the cube, that is, the level's Disabled property must be set to No. The level can be hidden.

If these conditions are met, and the cube's schema is optimized using the Optimize Schema option, the Analysis server ignores the dimension table in the database when processing the cube. If these conditions are met for all dimensions in the cube, the Analysis server needs to read only the fact table to process the cube. Processing time reductions often can be substantial when this optimization technique is used.

Cube schema optimization applies to all partitions of the cube whether the partitions are processed independently or as a group.

Note  You should not optimize a cube's schema if you depend on inner joins between the fact table and dimension tables to exclude fact rows for the cube content. The entire fact table is read if all dimension table joins are removed by this optimization.

Because schema optimization can eliminate joins, a cube with an optimized schema may not display all available tables for use when specifying drillthrough options. You can join a table to the schema for drillthrough when specifying drillthrough options by adding the table and defining a SQL WHERE clause to establish the join. For more information, see Specifying Drillthrough Options.

Member Key Column

Analysis Services uses the Member Key Column property of the lowest level of a dimension to control cube schema optimization. During cube schema optimization, each dimension is evaluated to determine if it meets the conditions for optimization. If the dimension meets the required conditions, the Member Key Column property of the lowest level of the dimension is changed to refer to the foreign key in the fact table instead of the key in the dimension table. For example, before optimization the dimension level's Member Key Column is "Products"."SKU_ID", which is joined to the key, "Facts"."SKU_Key", in the fact table. After optimization, the Member Key Column property value is "Facts"."SKU_Key". This signals the Analysis server to use the key from the fact table during processing instead of issuing queries that join the dimension to the fact table in the relational database.

Example

A dimension for time contains the levels Year, Quarter, Month, and Day. There is a dimension member for each day in the dimension, and each day member has a unique key, which is specified as the member key column for the Day level. The Member Keys Unique property for the Day level is set to True.

The dimension's Day level member keys are used as foreign keys in the fact table to relate the dimension table to the fact table. No other keys are required to uniquely relate a fact table row to a row in the dimension table.

A cube is designed that uses this fact table and this time dimension. It is preferable that the cube contain summarized data at the Month level and above but not at the Day level. In the cube, the Disabled property for the dimension's Day level is set to No, so the level keys will be available for the cube processing optimization. The Visible property for the dimension's Day level is set to False, so the cube will not display data for the Day level.

When the Optimize Schema command on the Tools menu is selected, the cube schema is optimized. Then, when the cube is saved and processed, the SQL query issued by the Analysis server to read the fact table will not need to join or access the dimension table.

Modifying Cube Schema Optimization

You can remove the optimization for one or more dimensions in a cube by changing the Member Key Column property for the lowest level of each of the dimensions to refer to its original column in the dimension table. This will cause the Analysis server to issue a query that joins the dimension table to the fact table during processing.

Note  A cube's schema optimization can be affected by adding or deleting dimensions, or by modifying dimension properties in the cube. You should check the cube's schema optimization or redo the optimization whenever you make such changes.

Unknown Dimension Member Error

This error indicates that a dimension member's key is not found in the internal representation of a dimension when processing a cube that contains the dimension. The cause can be either that a dimension has not been processed after new members were added, or that the dimension table does not contain a key that matches a key found in the fact table.

This error occurs regardless of whether a cube's schema has been optimized if new members are added to a dimension and related facts are added to the fact table but the dimension has not been processed. It makes no difference whether the member keys are read from the joined dimension (schema not optimized) or from the member foreign keys in the fact table (schema optimized). The internal representation of the dimension will not contain the new keys until the dimension has been processed.

There is one situation where this error is triggered if the cube's schema has been optimized, but is not be triggered if the schema has not been optimized. This condition occurs when a fact has been added to the fact table but no corresponding member exists in a dimension table. If the cube's schema has been optimized, the key for the new fact will be read from the fact table but not found in the internal representation of the dimension, even if the dimension has been processed. However, if the cube's schema has not been optimized, the query that joins the dimension table to the fact table causes any facts that do not have corresponding dimension members to be ignored and not read during processing, so the error is not triggered.

You can avoid these errors by maintaining referential integrity between dimension tables and fact tables, and by always processing a dimension after making changes to the dimension table and before processing cubes that use the dimension.