Partitioning

Optimizing SQL Database Performance

Optimizing Database Performance

Partitioning

Partitioning a database improves performance and simplifies maintenance. By splitting a large table into smaller, individual tables, queries accessing only a fraction of the data can run faster because there is less data to scan. Maintenance tasks, such as rebuilding indexes or backing up a table, can execute more quickly.

Partitioning can be achieved without splitting tables by physically placing them on individual disk drives. Placing a table on one physical drive and related tables on a separate drive, for example, can improve query performance because when queries involving joins between the tables are executed, multiple disk heads read data at the same time. Microsoft® SQL Server™ 2000 filegroups can be used to specify on which disks to place the tables.

Hardware Partitioning

Hardware partitioning designs the database to take advantage of the available hardware architecture. Examples of hardware partitioning include:

  • Multiprocessors that allow multiple threads of execution, permitting many queries to execute at the same time. Alternatively, a single query may be able to run faster on multiple processors by allowing components of the query to be executed simultaneously. For example, each table referenced in the query can be scanned at the same time by a different thread.

  • RAID (redundant array of independent disks) devices that allow data to be striped across multiple disk drives, permitting faster access to the data because more read/write heads read data at the same time. A table striped across multiple drives can typically be scanned faster than the same table stored on one drive. Alternatively, storing tables on separate drives from related tables can significantly improve the performance of queries joining those tables.
Horizontal Partitioning

Horizontal partitioning segments a table into multiple tables, each containing the same number of columns but fewer rows. For example, a table containing 1 billion rows could be partitioned horizontally into 12 tables, with each smaller table representing one month of data for a given year. Any queries requiring a specific month's data reference the appropriate table only.

Determining how to partition the tables horizontally depends on how data is analyzed. Partition the tables so that queries reference as few tables as possible. Otherwise, excessive UNION queries, used to merge the tables logically at query time, can impair performance. For more information about querying horizontally partitioned tables, see Scenarios for Using Views.

Partitioning data horizontally based on age/use is common. For example, a table may contain data for the last five years, but only data from the current year is regularly accessed. In this case, you may consider partitioning the data into five tables, with each table containing data from only one year.

Vertical Partitioning

Vertical partitioning segments a table into multiple tables containing fewer columns. The two types of vertical partitioning are normalization and row splitting.

Normalization is the standard database process of removing redundant columns from a table and placing them in secondary tables linked to the primary table by primary key and foreign key relationships.

Row splitting divides the original table vertically into tables with fewer columns. Each logical row in a split table matches the same logical row in the others. For example, joining the tenth row from each split table re-creates the original row.

Like horizontal partitioning, vertical partitioning allows queries to scan less data, hence increasing query performance. For example, a table containing seven columns, of which only the first four are commonly referenced, may benefit from splitting the last three columns into a separate table.

Vertical partitioning should be considered carefully because analyzing data from multiple partitions requires queries joining the tables, possibly affecting performance if partitions are very large.

See Also

Using Views with Partitioned Data