Full-Text Indexing Support
This illustration shows the components that make up the full-text indexing support. These are the components involved in defining, creating, and populating full-text indexes.
Enabling databases and tables for full-text indexing, defining, and populating the indexes is specified using:
- SQL Server Enterprise Manager.
One of the nodes of a database tree in SQL Server Enterprise Manager is used to manage the full-text catalogs in the database.
- Applications using SQL Distributed Management Objects (SQL-DMO).
SQL-DMO has objects for managing full-text catalogs and indexes.
- Applications using Transact-SQL and a standard database API.
Transact-SQL has a set of system stored procedures for managing full-text catalogs and indexes.
The other components define and populate full-text indexes in this manner:
- A Microsoft® SQL Server™ 2000 database is enabled for full-text indexing.
- The full-text catalogs for the database are specified.
- Individual tables are enabled for full-text indexing and associated with a catalog.
- Individual columns in each table are added to the full-text index for the table. All the meta data information from Steps from 1 through 4 is stored in system tables in SQL Server databases.
- The full-text indexes for each table are activated on a table-by-table basis. When a full-text table index is activated, a start seed value is sent from an instance of SQL Server to the indexing service within the Microsoft Search service. The start seed value identifies the table involved in the full text index.
- Population is requested on either a catalog-by-catalog or table-by-table basis. Populating on a catalog basis allows you to populate multiple indexes in one operation; populating tables lets you populate specific indexes.
The population in Step 6 can take different forms:
- Full population
If a full population is requested for a full-text catalog, index entries are built for all the rows in all the tables covered by the catalog. If a full populates is requested for a table, index entries are built for all the rows in that table. A full population typically occurs when a catalog or index is first populated, the indexes can then be maintained using change tracking or incremental populations.
- Change tracking population
Maintains a record of the rows that have been modified in a system table, and propagates the changes to the full-text index. You start the change tracking by executing sp_fulltext_table and specify start_change_tracking for the @action parameter. When using change tracking, you also specify when the changes are taken from the history table and populated in the full-text index:
- Background
After starting change tracking with start_change_tracking, you can execute sp_fulltext_table specifying start_background_updateindex for the @action parameter. With this option, changes to rows in the table are propagated to the full-text index as they occur.
- On demand
In this option, all tracked changes are stored in the history, and only propagated to the full-text index when you execute sp_fulltext_table specifying update_index for the @action parameter.
- Scheduled
You can use SQL Agent to schedule periodic jobs that execute sp_fulltext_table specifying update_index for the @action parameter. This will propagate all outstanding tracked changes to the index.
- Background
- Incremental population
Only adjusts index entries for rows added, deleted, or modified after the last population. This feature requires that the indexed table have a column of the timestamp data type. If the table does not have a timestamp column, only full or change tracking populations can be performed. Requests for incremental populations on tables without timestamp columns result in a full population operation.
If a new full-text index is defined for a table not associated with the catalog before, the next catalog-level incremental population request builds all the entries for the table.
Incremental population requests are implemented as full populations if any of the meta data for the table has changed since the last population. This includes altering any column, index, or full-text index definitions.
Each population request is sent to the indexing service within the Microsoft Search service:
- The indexing service passes the appropriate start seed value to the SQL Server Handler. The start seed value contains information such as the table and index involved in the population, and the timestamp value (if the table has a timestamp column) associated with the last full or incremental population performed for the index.
- The SQL Server Handler is a driver containing logic to extract text data from the SQL Server columns involved in a full-text index. The Handler retrieves the data from SQL Server and passes it back to the index service. For a full population, the SQL Server Handler extracts all the rows in the table. For an incremental population, the SQL Server Handler only extracts information from rows whose current timestamp values are higher than the timestamp associated with the last population, which is stored in the start seed.
- The indexing service then passes an index identifier and the strings to be indexed to the index engine. The index engine eliminates noise words such as a, and, or the. It also determines the word boundaries and builds a full-text index covering the words passed down from the indexing service. This linguistic analysis differs depending on the language in which the text is written. SQL Server 2000 supports linguistic analysis for several languages; the language is specified using sp_fulltext_column. The full-text index is stored in the full-text catalog file.
- At the end of the population, the indexing service calculates a new start seed value that records the point at which a subsequent incremental population should start.