RecoVB for Visual Basic

Microsoft Speech SDK

Speech Automation 5.1

Introduction

RecoVB is an application demonstrating basic speech recognition (SR) techniques. It displays the following information associated with the SR process:

  • Text of the recognition
  • Associated phrase elements and information
  • Events for the recognition recorded as they are initiated and processed
  • Event interests for the SR attempt
  • Grammars to be used
  • Control of SR engine

You can use RecoVB to test SR processes and see the results of the attempts.

    To use RecoVB, select the characteristics of the recognition context. These include the recognition type (command and control or dictation) and the engine type to create (shared or InProc). By default, the configuration is set to command and control (C and C) in a shared environment, and the default grammar is sol.xml. Once these parameters are set, click Start Recognition and speak into the microphone. The text appears on the screen as the speech is processed. If you selected any events or stream information, they appear in the events display at the bottom of the main window. Once a recognition occurs, the larger Recognition window displays the results. Each recognition is added as a new line in a tree structure, labeled Recognition and followed by the word or phrase that was recognized.

    To see information associated with a specific recognition, open the tree view for that item by clicking the line of the recognition or the small box to its left. As in other tree view displays, a small plus sign ("+") in the box means there is additional information to display. The expanded display lists the details of that recognition result.
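
    As a rough illustration of how an application might obtain the text and phrase elements shown in the tree view, the sketch below handles the Recognition event of a shared recognition context. It assumes a form module in a project that references the Microsoft Speech Object Library. The names RC and Grammar and the use of Debug.Print are illustrative assumptions and are not taken from the RecoVB source.

```vb
Option Explicit

' Assumed names for this sketch; RecoVB itself may be organized differently.
Private WithEvents RC As SpSharedRecoContext
Private Grammar As ISpeechRecoGrammar

Private Sub Form_Load()
    ' Create a shared context and activate a dictation grammar.
    Set RC = New SpSharedRecoContext
    Set Grammar = RC.CreateGrammar(1)
    Grammar.DictationLoad
    Grammar.DictationSetState SGDSActive
End Sub

Private Sub RC_Recognition(ByVal StreamNumber As Long, _
                           ByVal StreamPosition As Variant, _
                           ByVal RecognitionType As SpeechRecognitionType, _
                           ByVal Result As ISpeechRecoResult)
    Dim Element As ISpeechPhraseElement

    ' Full text of the recognized phrase (the top-level tree line).
    Debug.Print "Recognition: " & Result.PhraseInfo.GetText

    ' One line per phrase element, similar to an expanded tree node.
    For Each Element In Result.PhraseInfo.Elements
        Debug.Print "  " & Element.DisplayText
    Next Element
End Sub
```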

    Options

    There are numerous options for RecoVB.

    Start Recognition

    Starts the SR process. Use Activate Mic to turn on the microphone, and then speak into it. After SR starts, the button label changes to Stop Recognition. While SR is active, the Recognition Type and Engine Creation radio buttons are disabled because these settings cannot change during an SR session.

    Activate Mic

    Controls the microphone status. If selected, the microphone is active and receives sound for processing. If not selected, no sound is processed through the microphone. Activate Mic is available only during speech recognition. Although an inactive microphone does not process sound for SR, clearing this option is not the preferred way to turn SR on or off: the microphone may be turned off for brief periods, but the SAPI engine remains active and continues to consume computer resources. To turn SR off, click Stop Recognition.

    Show Stream Info

    Displays the stream information associated with the SR session in the events list window.

    Recognition Type

    Controls the type of the recognition grammar. Select one of the following two types.

    C&C

    A command and control (C and C) grammar recognizes specific words. By default, the sol.xml grammar is used as an example, although a different grammar may be selected using the Recognition menu->Load Grammar. A C and C grammar is intended to restrict the user to a set of words often associated with specific tasks such as selecting menu items or, in the case of the default grammar, playing a game of solitaire. The limited grammar results in a better quality of recognition for the words that the application needs to process. It also filters out unnecessary words.

    Dictation

    A dictation grammar imposes no restrictions on the words that may be recognized. Unlike a C and C grammar, you can say any word or phrase and the SR process will attempt to recognize it. This enables you to dictate a letter or memo, for instance.
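
    The two recognition types can be sketched in code roughly as follows. This is a minimal example under stated assumptions, not the RecoVB implementation: it presumes a form module that references the Microsoft Speech Object Library, and the procedure name LoadGrammar and its parameters UseDictation and GrammarPath are hypothetical.

```vb
Option Explicit

Private WithEvents RC As SpSharedRecoContext
Private Grammar As ISpeechRecoGrammar

' UseDictation and GrammarPath are hypothetical parameters for this sketch.
' GrammarPath would be the full path to a C and C grammar such as sol.xml.
Private Sub LoadGrammar(ByVal UseDictation As Boolean, _
                        ByVal GrammarPath As String)
    Set RC = New SpSharedRecoContext
    Set Grammar = RC.CreateGrammar(1)

    If UseDictation Then
        ' Dictation: no restriction on the words that may be recognized.
        Grammar.DictationLoad
        Grammar.DictationSetState SGDSActive
    Else
        ' C and C: only phrases defined by the grammar file are recognized.
        Grammar.CmdLoadFromFile GrammarPath, SLODynamic
        ' Rule ID 0 is understood here to address the top-level rules;
        ' check the SDK reference before relying on this.
        Grammar.CmdSetRuleIdState 0, SGDSActive
    End If
End Sub
```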

    Engine Creation

    Controls how the engine is instantiated for the session. Select one of the following two types, shared or InProc.

    Shared

    A shared environment (also called context) allows the SR engine to be used by other applications concurrently. This is the more common of the two environments. See ISpeechRecognizer for additional details.

    InProc

    An in-process or InProc environment restricts the SR engine to only one application. No other application may use that engine concurrently. See ISpeechRecognizer for additional details.
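
    The difference between the two engine creation modes might look like this in code. The sketch assumes a form module that references the Microsoft Speech Object Library. The procedure name CreateContext, its UseShared parameter, and the choice of the first available audio input are assumptions made for illustration only.

```vb
Option Explicit

' Separate variables are used because the two context classes differ.
Private WithEvents SharedRC As SpSharedRecoContext
Private WithEvents InProcRC As SpInProcRecoContext

' UseShared is a hypothetical parameter for this sketch.
Private Sub CreateContext(ByVal UseShared As Boolean)
    If UseShared Then
        ' Shared: the engine can serve other applications concurrently.
        Set SharedRC = New SpSharedRecoContext
    Else
        ' InProc: the engine is private to this application.
        Set InProcRC = New SpInProcRecoContext
        ' An in-process recognizer needs its audio input assigned explicitly.
        ' Taking the first listed input is an assumption of this sketch.
        Set InProcRC.Recognizer.AudioInput = _
            InProcRC.Recognizer.GetAudioInputs.Item(0)
    End If
End Sub
```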

    Engine

    This drop-down box lists the engines available. Only one engine may be used at a time, and all SR instances must be of the same engine type.
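
    A list like the one in this drop-down box could be produced by enumerating the installed recognizer tokens. The following is a short sketch assuming a reference to the Microsoft Speech Object Library. The procedure name ListEngines is hypothetical, and output goes to the Immediate window rather than to a combo box.

```vb
Option Explicit

' Hypothetical helper: print a description of each installed SR engine.
Private Sub ListEngines()
    Dim Recognizer As SpSharedRecognizer
    Dim Token As SpObjectToken

    Set Recognizer = New SpSharedRecognizer

    ' GetRecognizers returns a collection of tokens, one per engine.
    For Each Token In Recognizer.GetRecognizers
        Debug.Print Token.GetDescription
    Next Token
End Sub
```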

    Emulate Recognition

    This option allows typed text to be processed by the SR engine. There are instances when users may wish to see the results and events associated with an SR attempt and will need a method to replicate the speech attempt each time. Enter the text in the edit box and click Emulate to start the process.
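
    In code, emulation could be sketched as below. This assumes a form module that references the Microsoft Speech Object Library. The procedure name StartEmulation and its parameters are hypothetical, and passing a plain string as the text to emulate is an assumption to verify against the ISpeechRecognizer reference.

```vb
Option Explicit

Private WithEvents RC As SpSharedRecoContext
Private Grammar As ISpeechRecoGrammar

' GrammarPath and Phrase are hypothetical parameters for this sketch.
Private Sub StartEmulation(ByVal GrammarPath As String, _
                           ByVal Phrase As String)
    ' Set up a context with an active C and C grammar.
    Set RC = New SpSharedRecoContext
    Set Grammar = RC.CreateGrammar(1)
    Grammar.CmdLoadFromFile GrammarPath, SLODynamic
    Grammar.CmdSetRuleIdState 0, SGDSActive

    ' Submit typed text to the engine in place of spoken audio.
    RC.Recognizer.EmulateRecognition Phrase
End Sub

Private Sub RC_Recognition(ByVal StreamNumber As Long, _
                           ByVal StreamPosition As Variant, _
                           ByVal RecognitionType As SpeechRecognitionType, _
                           ByVal Result As ISpeechRecoResult)
    ' An emulated phrase arrives through the same event as a spoken one.
    Debug.Print "Recognition: " & Result.PhraseInfo.GetText
End Sub
```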

    Emulate

    Click Emulate to start the emulated speech process. See Emulate Recognition for more details.

    Current C&C Grammar

    Displays the current C and C grammar. The grammar may be changed through the Recognition menu Load Grammar item.

    Event Interests

    Use Event Interests to set which event interests to display and process. An event interest is a flag allowing SAPI or the SR engine to return information to the application. When an event occurs (such as a recognition or the start of a new stream), the SR engine can send a message back to the application. For example, for an application to display the text of a successful recognition, the application must receive the Recognition event interest. The application uses that message as its cue to extract the contents of the recognition. However, not all events are useful to the application at any one time. You can prevent the application from receiving particular event interests by turning them off in Event Interests. Likewise, they may be reactivated at any time.

    To receive certain event interests, select the ones you want from the list box. By default, all are active except for Audio Level. To suppress receiving an event interest, clear the check box. See ISpeechRecoContextEvents for details about shared or InProc event interests.
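
    Because event interests are flags, they can be combined and cleared with bitwise operators. The sketch below assumes a form module that references the Microsoft Speech Object Library. The helper SetAudioLevelInterest is hypothetical and shows only one possible way to toggle a single interest.

```vb
Option Explicit

Private WithEvents RC As SpSharedRecoContext

Private Sub Form_Load()
    Set RC = New SpSharedRecoContext
    ' Every event interest except Audio Level, matching the default
    ' described above.
    RC.EventInterests = SREAllEvents And (Not SREAudioLevel)
End Sub

' Hypothetical helper: switch the Audio Level interest on or off.
Private Sub SetAudioLevelInterest(ByVal Enabled As Boolean)
    If Enabled Then
        RC.EventInterests = RC.EventInterests Or SREAudioLevel
    Else
        RC.EventInterests = RC.EventInterests And (Not SREAudioLevel)
    End If
End Sub

' Raised only while the Audio Level interest is set.
Private Sub RC_AudioLevel(ByVal StreamNumber As Long, _
                          ByVal StreamPosition As Variant, _
                          ByVal AudioLevel As Long)
    Debug.Print "Audio level: " & AudioLevel
End Sub
```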

    Clear Event List

    Clears the events list window.

    Clear Tree View

    Clears the recognition results window.

    Play Audio

    Plays back the audio portion of the last recognition. This audio is the actual audio spoken by the user and is the sound sent to the SR engine. This is helpful in attempting to understand the results of a particular recognition. You must select Retain Audio in order to use this option.

    Retain Audio

    Keeps, or retains, the audio from the recognition attempt. See Play Audio for complete details.
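
    The relationship between Retain Audio and Play Audio can be sketched as follows: audio must be retained on the context before a result can play it back. The example assumes a form module that references the Microsoft Speech Object Library. The names LastResult and PlayLastAudio are hypothetical.

```vb
Option Explicit

Private WithEvents RC As SpSharedRecoContext
Private Grammar As ISpeechRecoGrammar
Private LastResult As ISpeechRecoResult   ' hypothetical: most recent result

Private Sub Form_Load()
    Set RC = New SpSharedRecoContext
    ' Ask SAPI to keep the audio of each recognition attempt.
    RC.RetainedAudio = SRAORetainAudio

    Set Grammar = RC.CreateGrammar(1)
    Grammar.DictationLoad
    Grammar.DictationSetState SGDSActive
End Sub

Private Sub RC_Recognition(ByVal StreamNumber As Long, _
                           ByVal StreamPosition As Variant, _
                           ByVal RecognitionType As SpeechRecognitionType, _
                           ByVal Result As ISpeechRecoResult)
    ' Remember the result so its audio can be replayed later.
    Set LastResult = Result
End Sub

' Hypothetical helper: replay what the user actually said.
Private Sub PlayLastAudio()
    If Not LastResult Is Nothing Then
        LastResult.SpeakAudio
    End If
End Sub
```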

    Exit

    Exits RecoVB. The application may also be exited by clicking the close box in the title bar or by the File menu Exit item.

    File Menu: Exit

    Exits RecoVB. The application may also be exited by clicking the close box in the title bar or by clicking the Exit button.

    Help Menu: About

    Displays the About box for RecoVB.

    Compile

    RecoVB is a standard Visual Basic application and does not require special support. However, the Speech reference must be active; see Creating a Speech-Enabled Visual Basic Project in Using the Visual Basic Code Examples for details on speech-enabling Visual Basic applications. Additionally, the samples are installed as Locked files and must be unlocked before they can be modified. To unlock them, right-click the file or files, select Properties, and clear the Read-Only check box.