About optical character recognition (OCR)

Microsoft Office Document Imaging

Optical character recognition (OCR) translates images of text, such as scanned documents, into actual text characters. Also known as text recognition, OCR makes it possible to edit and reuse text that is otherwise locked inside scanned images. OCR uses a form of artificial intelligence known as pattern recognition to identify the individual text characters on a page, including punctuation marks, spaces, and ends of lines.
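To illustrate the general idea of pattern recognition, the following is a minimal sketch that matches a small bitmap against stored character templates. It is a toy example and not how Microsoft Office Document Imaging is implemented; the templates and the names TEMPLATES, score, and recognize are hypothetical.

```python
# Toy glyph templates: 5 rows x 3 columns, '#' = ink, '.' = background.
# These shapes are made up for illustration only.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def score(glyph, template):
    """Count the pixels where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))


# A scanned "T" with one ink pixel missing, as might happen with a poor scan.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(noisy_t))
```

Because the match is scored rather than exact, a slightly damaged character can still be identified, which is one reason scan quality affects OCR accuracy.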

OCR can be performed in three scenarios:

  • Automatic   OCR runs automatically each time you perform a new scan, unless you change the scanning presets.
  • Manual   Run OCR manually for documents that were scanned using another program.
  • Indexing   Indexing is a system service that helps you quickly find files on your computer using text searches. When you perform OCR on Tagged Image File Format (TIFF) files, the recognized text becomes available to the index, so that searches can return the relevant TIFF files. You can index any TIFF file on your computer.
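
As a rough illustration of why recognized text makes TIFF files searchable, the sketch below builds a simple word index over OCR output. It does not represent the actual indexing service; the file names, the sample text, and the functions build_index and search are hypothetical.

```python
import re
from collections import defaultdict


def build_index(recognized):
    """Map each lowercase word to the set of files whose OCR text contains it."""
    index = defaultdict(set)
    for filename, text in recognized.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(filename)
    return index


def search(index, query):
    """Return the files that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


# Hypothetical OCR output for two scanned TIFF files.
recognized = {
    "invoice.tif": "Invoice 1042 for office supplies",
    "letter.tif": "Letter regarding office relocation",
}
index = build_index(recognized)
print(sorted(search(index, "office")))
print(sorted(search(index, "office invoice")))
```

A TIFF file that has not been through OCR contributes no words to such an index, so a text search cannot find it.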

Setting OCR options

Optimizing OCR accuracy