Full-Text Search Upgrade

SQL Server Setup

Upgrading full-text search to SQL Server 2008 is done during setup and when database files and full-text catalogs from a previous version of SQL Server are attached, restored, or copied using the Copy Database Wizard.

This topic discusses the following aspects of full-text upgrade:

Upgrading a Server Instance

For an in-place upgrade, an instance of SQL Server 2008 is set up side-by-side with the old version of SQL Server, and data is migrated. If the old version of SQL Server had full-text search installed, a new version of full-text search is automatically installed. Side-by-side install means that each of the following components exists at the instance-level of SQL Server.

Word breakers, stemmers, and filters

Each instance now uses its own set of word breakers, stemmers, and filters, rather than relying on the operating system version of these components. These components are also easier to register and configure at a per-instance level. For more information, see Word Breakers and Stemmers and Full-Text Search Filters.

Filter daemon host

The full-text filter daemon hosts are processes that safely load and drive third-party extensible components used for index and query, such as word breakers, stemmers, and filters, without compromising the integrity of the Full-Text Engine. A server instance uses a multithreaded process for all multithreaded filters and a single-threaded process for all single-threaded filters.

Note:
SQL Server 2008 introduces a service account for the FDHOST Launcher service (MSSQLFDLauncher). This service propagates the service account information to the filter daemon host processes of a specific instance of SQL Server. For information about setting the service account, see How to: Set the FDHOST Launcher (MSSQLFDLauncher) Service Account for Full-Text Search (SQL Server Configuration Manager).

In SQL Server 2005 and earlier versions, each full-text index resides in a full-text catalog that belongs to a filegroup, has a physical path, and is treated as a database file. In SQL Server 2008, a full-text catalog is a logical concept—a virtual object—that refers to a group of full-text indexes. Therefore, a new full-text catalog is not treated as a database file with a physical path. However, during upgrade of any full-text catalog that contains data files, a new filegroup is created on same disk. This maintains the old disk I/O behavior after upgrade. Any full-text index from that catalog is placed in the new filegroup if the root path exists. If the old full-text catalog path is invalid, the upgrade keeps the full-text index in the same filegroup as base table or, for a partitioned table, in the primary filegroup.

For more information about the architecture of full-text search in SQL Server 2008, see Full-Text Search Architecture.

Note:
SQL Server 2005 Transact-SQL DDL statements that specify full-text catalogs work correctly.

Full-Text Upgrade Options

When upgrading a server instance to SQL Server 2008, the user interface allows you to choose one of the following full-text upgrade options.

Import

Full-text catalogs are imported. Typically, import is significantly faster than rebuild. For example, when using only one CPU, import runs about 10 times faster than rebuild. However, an imported full-text catalog does not use the new and enhanced word breakers introduced in SQL Server 2008, so you might want to rebuild your full-text catalogs eventually.

Note:
Rebuild can run in multi-threaded mode, and if more than 10 CPUs are available, rebuild might run faster than import if you allow rebuild to use all of the CPUs.

If a full-text catalog is not available, the associated full-text indexes are rebuilt. This option is available for only SQL Server 2005 databases.

For information about the impact of importing full-text index, see "Considerations for Choosing a Full-Text Upgrade Option," later in this topic.

Rebuild

Full-text catalogs are rebuilt using the new and enhanced word breakers. Rebuilding indexes can take awhile, and a significant amount of CPU and memory might be required after the upgrade.

Reset

Full-text catalogs are reset. SQL Server 2005 full-text catalog files are removed, but the metadata for full-text catalogs and full-text indexes is retained. After being upgraded, all full-text indexes are disabled for change tracking and crawls are not started automatically. The catalog will remain empty until you manually issue a full population, after the upgrade completes.

Considerations for Choosing a Full-Text Upgrade Option

When choosing the upgrade option for your upgrade, consider the following:

  • How do you use word breakers?
    The SQL Server 2008 full-text search service includes new word breakers and stemmers. These might change the results of full-text queries from previous releases for a specific text pattern or scenario. Therefore, how you use word breakers is important when choosing a suitable upgrade option:
    • If the word breakers of the full-text language you use did not change in SQL Server 2008, or if recall accuracy is not critical to you, importing is suitable. Later, if you experience any recall issues, you can upgrade to the new word breakers simply by rebuilding your full-text catalogs. For more information, see "Word Breakers and Imported Full-text indexes," later in this section.
    • If you care about recall accuracy and you use one of the word breakers that have been improved in SQL Server 2008, rebuilding is suitable.
  • Were any full-text indexes built on integer full-text key columns?
    Rebuilding performs internal optimizations that improve the query performance of the upgraded full-text index in some cases. Specifically, if you have full-text catalogs that contain full-text indexes for which the full-text key column of the base table is an integer data type, rebuilding achieves ideal performance of full-text queries after upgrade. In this case, we highly recommend using the Rebuild option.
    Note:
    For full-text indexes in SQL Server 2008, we recommend that the column serving as the full-text key be an integer data type. For more information, see Performance Tuning and Optimization (Full-Text Search).

  • What is the priority for getting your server instance online?
    Importing or rebuilding during upgrade takes a lot of CPU resources, which delays getting the rest of the server instance upgraded and online. If getting the server instance online as soon as possible is important, and if you are willing to run a manual population after the upgrade, Reset is suitable.

Using new Word Breakers After Importing a SQL Server 2005 Full-Text Index

When importing an SQL Server 2005 full-text index, it is important to realize that SQL Server 2008 includes new word breakers for many of the languages that exist in SQL Server 2005. Only the word breakers for English, Korean, Thai, and Chinese (all forms) remain the same. For other languages, SQL Server 2008 introduces a new generation of word breakers, which potentially might behave slightly differently from SQL Server 2005 word breakers in imported SQL Server 2005 full-text indexes.

If a full-text catalog was imported when a SQL Server 2005 database was upgraded to SQL Server 2008, one or more languages used by the full-text indexes in full-text catalog might now be associated with new word breakers. For those languages, when a query uses the new word breakers, mismatches between the query and the full-text index content might occur occasionally because of slight differences in the behavior of the old and new word breakers. In this case, to guarantee a total match between queries and the full-text index content, either:

  • Rebuild the full-text catalog that contains the full-text index (ALTER FULLTEXT CATALOG catalog_name REBUILD).
  • Issue a FULL POPULATION on the full-text index (ALTER FULLTEXT INDEX ON table_name START FULL POPULATION).
Important:
If your full-text catalogs and indexes are using unchanged word breakers, such as just the English word breakers, for instance, no mismatch can occur. The word breakers used at query times will be the ones used when the index was created at indexing and future query times.

For information about word breakers, see Word Breakers and Stemmers.

Upgrading Noise-Word Files to Stoplists

SQL Server 2005 noise words have been replaced by SQL Server 2008 stopwords. When a database is upgraded to SQL Server 2008 from a previous release, the noise-word files are no longer used in SQL Server 2008. However, the old noise-word files are stored in the FTDATA\ FTNoiseThresaurusBak folder, and you can use them later when updating or building the corresponding SQL Server 2008 stoplists.

After the upgrade:

  • If you never added, modified, or deleted any noise-word files in your installation of SQL Server 2005, the system stoplist should meet your needs.
  • If your noise-word files were modified in SQL Server 2005, those modifications are lost during upgrade. To re-create those updates, you must manually recreate those modifications in the corresponding SQL Server 2008 stoplist. For more information, see ALTER FULLTEXT STOPLIST (Transact-SQL).
  • If you do not want to apply any stopwords to your full-text indexes (for example, if you deleted or erased your noise-word files in your SQL Server 2005 installation), you must turn off the stoplist for each upgraded full-text index. Run the following Transact-SQL statement (replacing database with the name of the upgraded database and table with the name of the table):
    Copy Code
    Use database; 
    ALTER FULLTEXT INDEX ON table
       SET STOPLIST OFF;
    GO
    The STOPLIST OFF clause removes stop-word filtering, and it will trigger a population of the table, without filtering any words considered to be noise.

Back Up and Imported Full-Text Catalogs

For full-text catalogs that are rebuilt or reset during upgrade (and for new full-text catalogs), the fulltext catalog is a logical concept and does not reside in a filegroup. Therefore, to back up a full-text catalog in SQL Server 2008, you must identify every filegroup that contains a full-text index of the catalog and back them up, one by one. For more information, see Backing Up and Restoring a SQL Server 2008 Full-Text Catalog.

For full-text catalogs that have been imported from SQL Server 2005, the full-text catalog is still a database file in its own filegroup. The SQL Server 2005 backup process for full-text catalogs still applies except that the MSFTESQL service does not exist in SQL Server 2008. For information about the SQL Server 2005 process, see "Backing Up and Restoring Full-Text Catalogs" in SQL Server 2005 Books Online.

Note:
For information about backup and restore of a SQL Server 2008 database, see Backing Up and Restoring a SQL Server 2008 Full-Text Catalog.

Migrating Full-Text Indexes When Upgrading a Database to SQL Server 2008

Database files and full-text catalogs from a previous version of SQL Server can be upgraded to an existing SQL Server 2008 server instance by using attach, restore, or the Copy Database Wizard. SQL Server 2005 full-text indexes, if any, are either imported, reset, or rebuilt. SQL Server 2000 full-text indexes, if any, are rebuilt or reset. The upgrade_option server property controls which full-text upgrade option the server instance uses during these database upgrades.

After you attach, restore, or copy any SQL Server 2005 or SQL Server 2000 database to SQL Server 2008, the database becomes available immediately and is then automatically upgraded. Depending on the amount of data being indexed, importing can take several hours, and rebuilding can take up to ten times longer. Note also that when the upgrade option is set to import, if a full-text catalog is not available, the associated full-text indexes are rebuilt.

To change full-text upgrade behavior on a server instance

  • Transact-SQL: Use the upgrade_option action of sp_fulltext_service
  • SQL Server Management Studio : Use the Full-Text Upgrade Option of the Server Properties dialog box. For more information, see How to: View or Change Server Properties for Full-Text Search (SQL Server Management Studio).

Considerations for Restoring a SQL Server 2005 Full-Text Catalog to SQL Server 2008

One method of upgrading fulltext data from a SQL Server 2005 database to SQL Server 2008 is to restore a full database backup to SQL Server 2008.

While importing a catalog, you can back up and restore the database and the catalog file. The behavior is the same as in SQL Server 2005:

  • The full database backup will include the full-text catalog. To refer to the full-text catalog, use its SQL Server 2005 file name, sysft_+catalog-name.
  • If the full-text catalog is offline, the backup will fail.

For more information about backing up and restoring SQL Server 2005 full-text catalogs, see Backing Up and Restoring Full-Text Catalogs and File Backup and Restore and Full-Text Catalogs in SQL Server 2005 Books Online.

When the database is restored on SQL Server 2008, a new database file will be created for the full-text catalog. The default name of this file is ftrow_catalog-name.ndf. For example, if your catalog-name is cat1, the default name of the SQL Server 2008 database file would be ftrow_cat1.ndf. But if the default name is already being used in the target directory, the new database file would be named ftrow_catalog-name{GUID}.ndf, where GUID is the Globally Unique Identifier of the new file.

After the catalogs have been imported, the sys.database_files and sys.master_files are updated to remove the catalog entries and the path column in sys.fulltext_catalogs is set to NULL.

To back up a database

  • Full Database Backups
  • Creating Transaction Log Backups (full recovery model only)

To restore a database backup

  • Performing a Complete Database Restore (Simple Recovery Model)
  • Performing a Complete Database Restore (Full Recovery Model)
Note:
For information about restoring SQL Server 2008 full-text indexes, see Piecemeal Restore and Full-Text Indexes.

Example

The following example uses the MOVE clause in the RESTORE statement, to restore a SQL Server 2005 database named ftdb1. The SQL Server 2005 database, log, and catalog files are moved to new locations on the SQL Server 2008 server instance, as follows:

  • The database file, ftdb1.mdf, is moved to C:\Program Files\Microsoft SQL Server\MSSQL.1MSSQL10.MSSQLSERVER\MSSQL\DATA\ftdb1.mdf.
  • The log file, ftdb1_log.ldf, is moved to a log directory on your log disk drive, log_drive:\log_directory\ftdb1_log.ldf.
  • The catalog files that correspond to the sysft_cat90 catalog are moved to C:\temp. After the full-text indexes are imported, they will automatically be placed in a database file, C:\ftrow_sysft_cat90.ndf, and the C:\temp will be deleted.
Copy Code
RESTORE DATABASE [ftdb1] FROM  DISK = N'C:\temp\ftdb1.bak' WITH  FILE = 1,
   MOVE N'ftdb1' TO N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\ftdb1.mdf',
    MOVE N'ftdb1_log' TO N'log_drive:\log_directory\ftdb1_log.ldf',
    MOVE N'sysft_cat90' TO N'C:\temp';

Attaching a SQL Server 2005 database to SQL Server 2008

In SQL Server 2008, a full-text catalog is a logical concept that refers to a group of full-text indexes. The full-text catalog is a virtual object that does not belong to any filegroup. However, when you attach a SQL Server 2005 database that contains full-text catalog files onto a SQL Server 2008 server instance, the catalog files are attached from their previous location along with the other database files, the same as in SQL Server 2005.

The state of each attached full-text catalog on SQL Server 2008 is the same as when the database was detached from SQL Server 2005. If any full-text index population was suspended by the detach operation, the population is resumed on SQL Server 2008, and the full-text index becomes available for full-text search.

If SQL Server 2008 cannot find a full-text catalog file or if the full-text file was moved during the attach operation without specifying a new location, the behavior depends on the selected full-text upgrade option. If the full-text upgrade option is Import or Rebuild, the attached full-text catalog is rebuilt. If the full-text upgrade option is Reset, the attached full-text catalog is reset.

For more information about detaching and attaching a database, see Detaching and Attaching Databases, CREATE DATABASE (Transact-SQL), sp_attach_db, and sp_detach_db (Transact-SQL).

See Also

Other Resources

Getting Started with Full-Text Search
Full-Text Engine
Word Breakers and Stemmers
Full-Text Search Filters