Training Characters

NI OCR Training Interface

Use the Train/Read tab to open images and train characters, and save the character values to a character set file.

Complete the following steps to train characters:

  1. Access the OCR Training Interface.
  2. Click File»Open Images, and select the image or images you want to use for training. You can select multiple image files by pressing the <Ctrl> key and clicking each file. You can also enable the Select all files checkbox to open all images in the directory you specified.
  3. Click Open.
  4. Use the navigation buttons to locate the image you want to use for training.
  5. On the image, draw an ROI around the characters you want to train.
    Tip: You can modify the view of the image if necessary.

    OCR segments objects in the ROI, drawing character bounding rectangles around them, according to the settings on each of the tabs at the bottom of the training interface. Text Read displays recognized characters and the substitution character based on the character set file you are using. If you have not opened a character set file, Text Read displays the substitution character for each of the segmented objects in the ROI. For example, if the ROI contains three segmented objects, Text Read contains three substitution characters. Any object that is surrounded by a character bounding rectangle is a segmented object.

    You can specify the substitution character in the Read Options tab.

  6. Use the Threshold, Advanced Threshold, and Size & Spacing tabs to set up the parameters you want to use in the training process. Adjust the threshold method and settings, and change the settings on the other tabs as needed, until OCR draws character bounding rectangles around the correct objects in the ROI. OCR displays segmented objects in blue.
  7. Click Train All Characters.
  8. In Correct String, enter the character values you want to associate with each of the objects in the ROI you specified. The number of character values you enter must match the number of objects found in the ROI.
  9. Click Train.
  10. If Text Read displays incorrect characters, adjust the parameters on the Threshold, Advanced Threshold, and Size & Spacing tabs to improve the results.
  11. Repeat steps 4 through 10 to train additional characters. To open more images first, repeat from step 2.
  12. Click File»Save Character Set File, enter a name for the character set file, and click Save.
  13. Use the navigation buttons to view each image. Ensure that Text Read displays the correct characters for each image. If Text Read displays the substitution character or an incorrect character, adjust the parameters on the Read Options tab, train the incorrect characters, or both, and then save the character set file again.
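
As a rough mental model of the procedure above, the following Python sketch treats a character set as a lookup table from segmented objects to character values. It is not the NI OCR API. The names `read_text` and `train_all`, the string stand-ins for segmented objects, and the `?` substitution character are all assumptions made for illustration, and real OCR matches objects approximately rather than by exact lookup.

```python
# Conceptual model only; none of these names come from NI OCR.
SUBSTITUTION_CHAR = "?"  # assumed value; the real one is set on the Read Options tab


def read_text(objects, char_set, substitution=SUBSTITUTION_CHAR):
    """Mimic Text Read: one character per segmented object."""
    return "".join(char_set.get(obj, substitution) for obj in objects)


def train_all(objects, correct_string, char_set):
    """Mimic Train All Characters: pair each object with a character value."""
    if len(correct_string) != len(objects):
        raise ValueError(
            f"Correct String has {len(correct_string)} values, "
            f"but the ROI contains {len(objects)} objects"
        )
    for obj, value in zip(objects, correct_string):
        char_set[obj] = value
    return char_set


# Hypothetical stand-ins for three segmented objects in the ROI.
objects = ["glyph_A", "glyph_P", "glyph_R"]
char_set = {}  # no character set file opened yet

before = read_text(objects, char_set)  # three substitution characters
train_all(objects, "APR", char_set)
after = read_text(objects, char_set)

print(before)  # ???
print(after)   # APR
```

The length check mirrors the rule in step 8: the number of character values in Correct String must match the number of segmented objects in the ROI.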

Training Incorrect Characters

Train incorrect characters when the ROI you draw contains both characters you already trained and characters that OCR does not yet recognize. For example, if you analyze an image that contains the letters A, P, and R, train these letters, and save them to a character set file, you can later use that character set file to train characters on another image. If the second image contains the letters A, P, R, and O, Text Read displays the recognized characters A, P, and R, followed by the substitution character in place of O.

You also train incorrect characters when OCR displays the wrong character value for a segmented object. For example, if an ROI contains the letters A, P, R, and O and Text Read includes the letter B instead of P, you use the Train Incorrect Characters option to correctly train the letter P.

Complete the following steps to train incorrect characters:

  1. Access the OCR Training Interface.
  2. Click File»Open Images, and select the image or images you want to use for training. You can select multiple image files by pressing the <Ctrl> key and clicking each file. You can also enable the Select all files checkbox to open all images in the directory you specified.
  3. Click Open.
  4. Use the navigation buttons to locate the image you want to use for training.
  5. On the image, draw an ROI around the characters you want to train.

    OCR segments objects in the ROI, drawing character bounding rectangles around them, according to the settings on each of the tabs at the bottom of the training interface. Text Read displays recognized characters and the substitution character based on the character set file you are using. If you have not opened a character set file, Text Read displays the substitution character for each of the segmented objects in the ROI. For example, if the ROI contains three segmented objects, Text Read contains three substitution characters. Any object that is surrounded by a character bounding rectangle is a segmented object.

  6. Use the Threshold, Advanced Threshold, and Size & Spacing tabs to set up the parameters you want to use in the training process. Adjust the threshold method and settings, and change the settings on the other tabs as needed, until OCR draws character bounding rectangles around the correct objects in the ROI. OCR displays segmented objects in blue.
  7. Click Train Incorrect Characters.
  8. Enter the appropriate character values in Correct String, including the previously trained and recognized characters, and click Train.

    Although you must enter character values for all segmented objects in Correct String, OCR trains only objects that do not have a match or have an incorrect match in Text Read.
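
The selection rule in the note above can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the NI OCR API: `train_incorrect`, the string stand-ins for segmented objects, and the `?` substitution character are hypothetical, and the character set is modeled as a plain dictionary.

```python
# Conceptual model only; none of these names come from NI OCR.
def train_incorrect(objects, text_read, correct_string, char_set):
    """Train only the objects whose Text Read value differs from Correct String."""
    if not (len(objects) == len(text_read) == len(correct_string)):
        raise ValueError(
            "Text Read and Correct String need one value per segmented object"
        )
    newly_trained = []
    for obj, read, correct in zip(objects, text_read, correct_string):
        if read != correct:  # substitution character or wrong character
            char_set[obj] = correct
            newly_trained.append(correct)
    return newly_trained


objects = ["glyph_A", "glyph_P", "glyph_R", "glyph_O"]
char_set = {"glyph_A": "A", "glyph_P": "P", "glyph_R": "R"}

# O is unknown, so Text Read shows an assumed "?" substitution character.
trained_missing = train_incorrect(objects, "APR?", "APRO", char_set)
print(trained_missing)  # ['O']

# P was misread as B, so only P is retrained.
trained_wrong = train_incorrect(objects, "ABRO", "APRO", char_set)
print(trained_wrong)  # ['P']
```

Both calls pass a full four-character Correct String, yet each trains a single object, which matches the two cases described at the start of this section.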

Training Single Characters or Patterns

Complete the following steps to train single characters or patterns:

  1. Access the OCR Training Interface.
  2. Click File»Open Images, and select the image or images you want to use for training. You can select multiple image files by pressing the <Ctrl> key and clicking each file. You can also enable the Select all files checkbox to open all images in the directory you specified.
  3. Click Open.
  4. Use the navigation buttons to locate the image you want to use for training.
  5. On the image, draw an ROI around the characters you want to train.

    OCR segments objects in the ROI, drawing character bounding rectangles around them, according to the settings on each of the tabs at the bottom of the training interface. Text Read displays recognized characters and the substitution character based on the character set file you are using. If you have not opened a character set file, Text Read displays the substitution character for each of the segmented objects in the ROI. For example, if the ROI contains three segmented objects, Text Read contains three substitution characters. Any object that is surrounded by a character bounding rectangle is a segmented object.

  6. Use the Threshold, Advanced Threshold, and Size & Spacing tabs to set up the parameters you want to use in the training process. Adjust the threshold method and settings, and change the settings on the other tabs as needed, until OCR draws character bounding rectangles around the correct objects in the ROI. OCR displays segmented objects in blue.
  7. Click Train Single Character.
  8. Select the Index of the character you want to train. OCR highlights the character bounding rectangle of the corresponding character in a different color.
  9. Enter the appropriate character value in Correct String.
  10. Click Train.
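
The single-character procedure above amounts to picking one segmented object by its index and assigning it a character value. The Python sketch below illustrates that idea only. It is not the NI OCR API: `train_single` and the string stand-ins for segmented objects are hypothetical, and the zero-based index is an assumption that may not match the numbering the Index control uses.

```python
# Conceptual model only; none of these names come from NI OCR.
def train_single(objects, index, value, char_set):
    """Associate one segmented object, chosen by index, with a character value."""
    if not 0 <= index < len(objects):
        raise IndexError(
            f"index {index} is outside the {len(objects)} segmented objects"
        )
    char_set[objects[index]] = value
    return char_set


objects = ["glyph_A", "glyph_P", "glyph_R", "glyph_O"]
char_set = {"glyph_A": "A", "glyph_P": "P", "glyph_R": "R"}

train_single(objects, 3, "O", char_set)  # zero-based index is an assumption

# Rebuild a Text Read style string, using an assumed "?" for unknown objects.
text_read = "".join(char_set.get(obj, "?") for obj in objects)
print(text_read)  # APRO
```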