About indexing text in TIFF files

Microsoft Office Document Imaging

When Microsoft Office Document Imaging performs optical character recognition (OCR) on a scanned document, the recognized text is stored within the Tagged Image File Format (TIFF) file when you save the document.

This text is available not only when you open the document in Office Document Imaging, but also when you search for files using other Microsoft Office programs, or when you use Microsoft Windows search features.

The indexing service

Indexing is a special service that enables fast file searches on your computer. Text found in files on your computer is added to the index, which also stores a reference to the file where the text was found.
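As a rough illustration of the idea, and not a description of how the Windows indexing service is actually implemented, an index can be sketched as a mapping from each word to the set of files that contain it. The function names build_index and search, and the sample file names, are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of file names that contain it.

    `documents` is a dict of {file name: text found in that file}.
    """
    index = defaultdict(set)
    for file_name, text in documents.items():
        for word in text.lower().split():
            # Store a reference to the file next to the word.
            index[word].add(file_name)
    return index


def search(index, word):
    """Return the sorted list of files whose text contains `word`."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data: text recognized in two scanned documents.
documents = {
    "invoice.tif": "Invoice for office supplies",
    "memo.tif": "Memo about office hours",
}
index = build_index(documents)
print(search(index, "office"))   # ['invoice.tif', 'memo.tif']
print(search(index, "invoice"))  # ['invoice.tif']
```

A search then becomes a single lookup in the index rather than a scan of every file, which is why an indexed search is fast.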

Without indexing, you can search for words only in files that have already had OCR performed on them, either automatically at scan time or manually from the File menu. With indexing, you can search for any TIFF file based on the words it contains.

The indexing service is implemented in different ways, depending on the operating system you are using.

Windows 2000

Windows 98 and Windows Millennium

If the indexing service is turned off, the only searchable text for TIFF files that have not had OCR performed on them is the file name and any available file properties.

Indexing TIFF files without embedded OCR information

OCR is automatically performed on TIFF files on your computer that do not already contain embedded OCR text, making their text available to the indexing service for file searches. In this case, the OCR text is stored only in the index, not within the TIFF files. This process takes several seconds for each TIFF file encountered.
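The behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual indexing code: index_tiff_files and fake_ocr are hypothetical names, and fake_ocr merely stands in for a real OCR engine.

```python
def index_tiff_files(files, run_ocr):
    """Build a word index over TIFF files, using OCR where no text is embedded.

    `files` maps each file name to its embedded OCR text, or None if the
    file has none. `run_ocr` is a caller-supplied function that returns the
    recognized text for a file name.
    """
    index = {}
    for file_name, embedded_text in files.items():
        # Use the embedded text if present; otherwise recognize it now.
        text = embedded_text if embedded_text is not None else run_ocr(file_name)
        for word in text.lower().split():
            index.setdefault(word, set()).add(file_name)
    # `files` is never modified: recognized text lives only in the index.
    return index


# Hypothetical stand-in for a real OCR engine.
def fake_ocr(file_name):
    return {"scan2.tif": "quarterly sales report"}.get(file_name, "")


files = {"scan1.tif": "meeting notes", "scan2.tif": None}
index = index_tiff_files(files, fake_ocr)
print(sorted(index["report"]))  # ['scan2.tif']
print(files["scan2.tif"])       # None
```

The file without embedded text becomes searchable through the index, yet the file itself is left unchanged.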

To turn off automatic indexing, click Options on the View menu, and then click Indexing Service. In the Indexing Service dialog box, clear the Use OCR to recognize the text in TIFF files when indexing check box.

Indexing in other languages

If you want to index documents in languages other than your computer's default language, you can select a dictionary from the OCR Language list in the Indexing Service dialog box.