I/O Architecture

SQL Server Architecture

SQL Server Architecture

I/O Architecture

The primary purpose of a database is to store and retrieve data, so performing a lot of disk reads and writes is one of the inherent attributes of a database engine. Disk I/O operations consume many resources and take a relatively long time to complete. Much of the logic in relational database software concerns making the pattern of I/O usage highly efficient.

Microsoft® SQL Server™ 2000 allocates much of its virtual memory to a buffer cache and uses the cache to reduce physical I/O. Each instance of SQL Server 2000 has its own buffer cache. Data is read from the database disk files into the buffer cache. Multiple logical reads of the data can be satisfied without requiring that the data be physically read again. The data remains in the cache until it has not been referenced for some time and the database needs the buffer area to read in more data. Data is written back to disk only if it is modified. Data can be changed multiple times by logical writes before a physical write transfers the new data back to disk.

The data in a SQL Server 2000 database is stored in 8-KB pages. Each group of eight contiguous pages is a 64-KB extent. The buffer cache is also divided into 8-KB pages.

The I/O from an instance of SQL Server is divided into logical and physical I/O. A logical read occurs every time the database engine requests a page from the buffer cache. If the page is not currently in the buffer cache, a physical read is then performed to read the page into the buffer cache. If the page is currently in the cache, no physical read is generated; the buffer cache simply uses the page already in memory. A logical write occurs when data is modified in a page in memory. A physical write occurs when the page is written to disk. It is possible for a page to remain in memory long enough to have more than one logical write made before it is physically written to disk.

One of the basic performance optimization tasks for an instance of SQL Server involves sizing the SQL Server memory. The goal is to make the buffer cache large enough to maximize the ratio of logical reads to physical reads, but not so large that excessive memory swapping starts generating physical I/O to the pagefile. Instances of SQL Server 2000 do this automatically under the default configuration settings.

By maintaining a relatively large buffer cache in virtual memory, an instance of SQL Server can significantly reduce the number of physical disk reads it requires. After a frequently referenced page has been read into the buffer cache, it is likely to remain there, eliminating further reads.

SQL Server 2000 uses two Microsoft Windows NT® and Windows® 2000 features to improve its disk I/O performance:

  • Scatter-gather I/O

    Before scatter-gather I/O was introduced in Windows NT version 4.0 Service Pack 2, all of the data for a disk read or write on Windows NT had to be in a contiguous area of memory. If a read transferred in 64 KB of data, the read request had to specify the address of a contiguous area of 64 KB of memory. Scatter-gather I/O allows a read or write to transfer data in to or out of discontiguous areas of memory. Windows 2000 also supports scatter-gather I/O.

    If an instance of SQL Server 2000 reads in a 64 KB extent, it does not have to allocate a single 64 KB area and then copy the individual pages to buffer cache pages. It can locate eight buffer pages, and then do a single scatter-gather I/O specifying the address of the eight buffer pages. Windows NT or Windows 2000 places the eight pages directly into the buffer pages, eliminating the need for the instance of SQL Server to do a separate memory copy.

  • Asynchronous I/O

    In an asynchronous I/O, an application requests a read or write operation from Windows NT or Windows 2000. Windows NT or Windows 2000 immediately returns control to the application. The application can then perform additional work, and later test to see if the read or write has completed. By contrast, in a synchronous I/O, the operating system does not return control to the application until the read or write completes. Using asynchronous I/O allows instances of SQL Server to maximize the work done by individual threads while they are processing a batch.

SQL Server supports multiple concurrent asynchronous I/O operations against each file. SQL Server 2000 dynamically determines the maximum number of I/O operations an instance can issue for any file.