About optical character recognition (OCR)

Microsoft Office Document Imaging

Optical character recognition (OCR) translates images of text, such as scanned documents, into actual text characters. Also known as text recognition, OCR makes it possible to edit and reuse the text that is normally locked inside scanned images. OCR works using a form of artificial intelligence known as pattern recognition to identify individual text characters on a page, including punctuation marks, spaces, and ends of lines.
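To make the idea of pattern recognition concrete, here is a minimal sketch of template matching, one simple way to recognize a character by comparing it with stored patterns. It is not the method Microsoft Office Document Imaging uses internally; the 3x5 glyph grids and the names TEMPLATES, score, and recognize are assumptions made up for this illustration.

```python
# Toy glyph templates on a 3x5 grid ('#' = ink, '.' = blank).
# These shapes are hypothetical illustration data, not real font data.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def score(glyph, template):
    """Count the pixels where the scanned glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the character whose template agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))


# A noisy scan of "T": one ink pixel in the stem is missing.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(noisy_t))  # prints T
```

Because the best-scoring template wins even when a few pixels differ, a slightly damaged character can still be recognized. Real OCR engines use far more robust techniques, but the principle of comparing image data against known character patterns is the same.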

OCR can be performed in three scenarios:

  • Automatic: OCR runs automatically each time you perform a new scan, unless you change the scanning presets.
  • Manual: Run OCR manually for documents that were scanned using another program.
  • Indexing: Indexing is a system service that helps you quickly find files on your computer using text searches. When you perform OCR on Tagged Image File Format (TIFF) or Microsoft Document Imaging Format (MDI) files, the recognized text is available to the index, making it possible to find relevant TIFF and MDI files when you search. You can index any TIFF or MDI file on your computer.
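The indexing scenario can be pictured with a small sketch of an inverted index, which maps each recognized word to the files that contain it. This only illustrates the general idea and is not how the Windows indexing service is implemented; the file names, the sample text, and the functions build_index and search are hypothetical.

```python
from collections import defaultdict


def build_index(recognized):
    """Map each lowercase word to the set of files whose OCR text contains it."""
    index = defaultdict(set)
    for filename, text in recognized.items():
        for word in text.lower().split():
            # Drop surrounding punctuation so "paper," is indexed as "paper".
            index[word.strip(".,;:!?")].add(filename)
    return index


def search(index, query):
    """Return the sorted names of files that contain every word in the query."""
    words = [w.lower() for w in query.split()]
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical OCR output for two scanned files.
recognized = {
    "invoice.tif": "Invoice for printer paper, due March 1.",
    "memo.mdi": "Memo: order more printer toner.",
}
index = build_index(recognized)
print(search(index, "printer"))        # prints ['invoice.tif', 'memo.mdi']
print(search(index, "printer toner"))  # prints ['memo.mdi']
```

Without OCR, a scanned TIFF or MDI file contains only pixels, so there would be no words to put into such an index and a text search could not find the file.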
