Physical Database Files and Filegroups

SQL Server Architecture

SQL Server Architecture

Physical Database Files and Filegroups

Microsoft® SQL Server™ 2000 maps a database over a set of operating-system files. Data and log information are never mixed on the same file, and individual files are used only by one database.

SQL Server 2000 databases have three types of files:

  • Primary data files

    The primary data file is the starting point of the database and points to the other files in the database. Every database has one primary data file. The recommended file name extension for primary data files is .mdf.

  • Secondary data files

    Secondary data files comprise all of the data files other than the primary data file. Some databases may not have any secondary data files, while others have multiple secondary data files. The recommended file name extension for secondary data files is .ndf.

  • Log files

    Log files hold all of the log information used to recover the database. There must be at least one log file for each database, although there can be more than one. The recommended file name extension for log files is .ldf.

SQL Server 2000 does not enforce the .mdf, .ndf, and .ldf file name extensions, but these extensions are recommended to help identify the use of the file.

In SQL Server 2000, the locations of all the files in a database are recorded in both the master database and the primary file for the database. Most of the time the database engine uses the file location information from the master database. For some operations, however, the database engine uses the file location information from the primary file to initialize the file location entries in the master database:

  • When attaching a database using the sp_attach_db system stored procedure.

  • When upgrading from SQL Server version 7.0 to SQL Server 2000.

  • When restoring the master database.

SQL Server 2000 files have two names:

  • logical_file_name is a name used to refer to the file in all Transact-SQL statements.

    The logical file name must conform to the rules for SQL Server identifiers and must be unique to the database.

  • os_file_name is the name of the physical file.

    It must follow the rules for Microsoft Windows NT® or Microsoft Windows® 98, and Microsoft Windows 95 file names.

These are examples of the logical file names and physical file names of a database created on a default instance of SQL Server 2000:

SQL Server data and log files can be placed on either FAT or NTFS file systems, but cannot be placed on compressed file systems.

Pages in a SQL Server 2000 data file are numbered sequentially starting with 0 for the first page in the file. Each file has a file ID number. Uniquely identifying a page in a database requires both the file ID and page number. The following example shows the page numbers in a database that has a 4-MB primary data file and a 1-MB secondary data file.

The first page in each file is a file header page containing information about the attributes of the file. Several of the other pages at the start of the file also contain system information, such as allocation maps. One of the system pages stored in both the primary data file and the first log file is a database boot page containing information about the attributes of the database.

SQL Server 2000 files can grow automatically from their originally specified size. When you define a file, you can specify a growth increment. Each time the file fills, it increases its size by the growth increment. If there are multiple files in a filegroup, they do not autogrow until all the files are full. Growth then occurs using a round-robin algorithm.

Each file can also have a maximum size specified. If a maximum size is not specified, the file can continue to grow until it has used all available space on the disk. This feature is especially useful when SQL Server is used as a database embedded in an application where the user does not have ready access to a system administrator. The user can let the files autogrow as needed to lessen the administrative burden of monitoring the amount of free space in the database and allocating additional space manually.

When multiple instances of SQL Server are run on a single computer, each instance gets a different default directory to hold the files for the databases created in the instance. For more information, see Directories and File Locations.

Database Filegroups

Database files can be grouped together in filegroups for allocation and administration purposes. Some systems can improve their performance by controlling the placement of data and indexes onto specific disk drives. Filegroups can aid this process. The system administrator can create filegroups for each disk drive, then assign specific tables, indexes, or the text, ntext, or image data from a table, to specific filegroups.

No file can be a member of more than one filegroup. Tables, indexes, and text, ntext, and image data can be associated with a filegroup, in which case all their pages will be allocated in that filegroup.

Log files are never a part of a filegroup. Log space is managed separately from data space.

Files in a filegroup will not autogrow unless there is no space available on any of the files in the filegroup.

There are two types of filegroups:

  • Primary

    The primary filegroup contains the primary data file and any other files not specifically assigned to another filegroup. All pages for the system tables are allocated in the primary filegroup.

  • User-defined

    User-defined filegroups are any filegroups specified using the FILEGROUP keyword in a CREATE DATABASE or ALTER DATABASE statement.

One filegroup in each database operates as the default filegroup. When SQL Server allocates a page to a table or index for which no filegroup was specified when they were created, the pages are allocated from the default filegroup. Only one filegroup at a time can be the default filegroup. Members of the db_owner fixed database role can switch the default filegroup from one filegroup to another. If no default filegroup is specified, the primary filegroup is the default filegroup.

SQL Server 2000 can work quite effectively without filegroups, so many systems will not need to specify user-defined filegroups. In this case, all files are included in the primary filegroup and SQL Server 2000 can allocate data anywhere in the database. Filegroups are not the only method that can be used to distribute I/O across multiple drives.

Members of the db_owner fixed database role can back up and restore individual files or filegroups instead of backing up or restoring an entire database.

The following example creates a database on a default instance of SQL Server 2000. The database has a primary data file, a user-defined filegroup, and a log file. The primary data file is in the primary filegroup and the user-defined filegroup has two secondary data files. An ALTER DATABASE statement makes the user-defined filegroup the default. A table is then created specifying the user-defined filegroup.

USE master
GO
-- Create the database with the default data
-- filegroup and the log file. Specify the
-- growth increment and the max size for the
-- primary data file.
CREATE DATABASE MyDB
ON PRIMARY
  ( NAME='MyDB_Primary',
   FILE NAME=
      'c:\Program Files\Microsoft SQL Server\MSSQL\data\MyDB_Prm.mdf',
   SIZE=4,
   MAXSIZE=10,
   FILEGROWTH=1),
FILEGROUP MyDB_FG1
  ( NAME = 'MyDB_FG1_Dat1',
   FILE NAME =
      'c:\Program Files\Microsoft SQL Server\MSSQL\data\MyDB_FG1_1.ndf',
   SIZE = 1MB,
   MAXSIZE=10,
   FILEGROWTH=1),
  ( NAME = 'MyDB_FG1_Dat2',
   FILE NAME =
      'c:\Program Files\Microsoft SQL Server\MSSQL\data\MyDB_FG1_2.ndf',
   SIZE = 1MB,
   MAXSIZE=10,
   FILEGROWTH=1)
LOG ON
  ( NAME='MyDB_log',
   FILE NAME =
      'c:\Program Files\Microsoft SQL Server\MSSQL\data\MyDB.ldf',
   SIZE=1,
   MAXSIZE=10,
   FILEGROWTH=1)
GO
ALTER DATABASE MyDB 
MODIFY FILEGROUP MyDB_FG1 DEFAULT
GO

-- Create a table in the user-defined filegroup.
USE MyDB
CREATE TABLE MyTable
  ( cola      int   PRIMARY KEY,
   colb      char(8) )
ON MyDB_FG1
GO

User filegroups can be made read-only. The data cannot be altered, but the catalog can still be modified to allow work such as permissions management.

SQL Server 2000 databases can be detached from a server and reattached to either another server or the same server. This is especially useful in making databases distributed for use on a customer's local SQL Server installation. For example, a company could create a database containing their current product catalog. The company could create this database on a writable compact disc drive and make the database read-only. They could then copy the compact disc and send copies to all of their field sales representatives equipped with a catalog application and SQL Server on Windows 95 laptops. The sales representatives would then have the latest catalog information.