Initialize and Manage a Speech Recognition Engine (Microsoft.Speech)

Microsoft Speech Platform SDK 11

You can use the SpeechRecognitionEngine class to manage any speech recognition engine that is installed on the system. You will typically use a SpeechRecognitionEngine object to manage an installed speech recognition engine as follows:

  • Select and initialize an engine to use for speech recognition.

  • Configure and monitor the input to the speech recognition engine.

  • Configure parameters for recognition.

  • Register for notification of events and handle the events.

  • Load and unload speech recognition grammars.

  • Start, pause, and stop recognition operations.

Select and Initialize a Speech Recognition Engine

You can use the parameters of the SpeechRecognitionEngine constructors to select an installed speech recognition engine that matches specific criteria, such as the language-culture, the recognizer name, or other attributes.
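As a minimal sketch of engine selection, the following C# console program lists the installed recognizers and then creates an engine by culture. It assumes the project references Microsoft.Speech.dll and that an en-US Runtime Language is installed; the class name SelectRecognizer is only illustrative.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class SelectRecognizer
{
    static void Main()
    {
        // Enumerate every recognizer installed for the Speech Platform.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0} | {1} | {2}", info.Id, info.Name, info.Culture.Name);
        }

        // Assumption: an en-US Runtime Language is installed. The constructor
        // throws ArgumentException when no installed recognizer supports the culture.
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            Console.WriteLine("Selected: " + recognizer.RecognizerInfo.Name);
        }
    }
}
```

To select by name or by other attributes instead, the constructor overloads that take a recognizer ID string or a RecognizerInfo returned by InstalledRecognizers can be used in the same way.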

Configure and Monitor the Input

You can configure the input to the SpeechRecognitionEngine to receive audio in a Wave stream, a Wave file, or an audio stream. See Audio Input for Recognition (Microsoft.Speech) for more information.

When the SpeechRecognitionEngine is receiving audio, you can monitor the incoming signal by querying the AudioState and AudioLevel properties and by registering a handler for the AudioSignalProblemOccurred event.
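The input configuration and monitoring described above might look like the following sketch. The Wave file path and the three-word color grammar are hypothetical placeholders, and an en-US Runtime Language is assumed to be installed.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class MonitorInput
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Hypothetical grammar; recognition needs at least one loaded grammar.
            var builder = new GrammarBuilder(new Choices("red", "green", "blue"));
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));

            // Hypothetical path; replace with a real Wave file.
            recognizer.SetInputToWaveFile(@"C:\temp\colors.wav");

            // Report problems with the incoming signal (for example, too much noise).
            recognizer.AudioSignalProblemOccurred += (sender, e) =>
                Console.WriteLine("Signal problem: {0} (level {1})",
                    e.AudioSignalProblem, e.AudioLevel);

            // Query the audio properties while the engine is processing input.
            recognizer.SpeechDetected += (sender, e) =>
                Console.WriteLine("State: {0}, level: {1}",
                    recognizer.AudioState, recognizer.AudioLevel);

            // Run one synchronous recognition operation over the file.
            recognizer.Recognize();
        }
    }
}
```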

Configure Parameters for Recognition

To fine-tune how the recognizer responds to background noise and silence that accompanies speech input, set the values of the BabbleTimeout, EndSilenceTimeout, and EndSilenceTimeoutAmbiguous properties.

Speech recognition operations produce multiple recognition result candidates, evaluate the accuracy of each result candidate with respect to the spoken input, and return the recognition candidate that most likely matches the received speech. You can control the number of alternate recognition results that the speech engine returns by setting the MaxAlternates property. You can also query the settings of a speech recognition engine that affect recognition, such as confidence thresholds, using the QueryRecognizerSetting(String) method and modify those settings with one of the UpdateRecognizerSetting methods.
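The following sketch sets the timeout and alternates properties and then queries one recognizer setting. The numeric values are illustrative only, and the setting name CFGConfidenceRejectionThreshold is an assumption: whether a given recognizer exposes it can vary, so the code handles the case where it does not.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using Microsoft.Speech.Recognition;

class TuneRecognizer
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Illustrative values, not recommendations.
            recognizer.BabbleTimeout = TimeSpan.FromSeconds(2);
            recognizer.EndSilenceTimeout = TimeSpan.FromMilliseconds(300);
            recognizer.EndSilenceTimeoutAmbiguous = TimeSpan.FromMilliseconds(750);
            recognizer.MaxAlternates = 5;

            // Assumed setting name; not every recognizer supports it.
            const string setting = "CFGConfidenceRejectionThreshold";
            try
            {
                object current = recognizer.QueryRecognizerSetting(setting);
                Console.WriteLine("{0} = {1}", setting, current);

                // Illustrative new value.
                recognizer.UpdateRecognizerSetting(setting, 30);
            }
            catch (KeyNotFoundException)
            {
                Console.WriteLine("This recognizer does not expose " + setting);
            }
        }
    }
}
```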

Register for Events and Author Handlers

The SpeechRecognitionEngine automatically raises events that return information to your application about the incoming signal, grammar loading, speech detection, preliminary recognition results, final recognition results, and the end of a recognition operation. Your application can stay informed of the status and progress of recognition operations by registering for these events. In the handler for each event, you write the code that determines how your application responds when the event is raised. See Use Speech Recognition Events (Microsoft.Speech).

The SpeechHypothesized, SpeechRecognized, SpeechRecognitionRejected, and RecognizeCompleted events all return a RecognitionResult object that contains detailed information about the results of recognition. This information includes the Text of the recognized word or phrase, the Semantics associated with the recognition, and the Confidence score assigned by the speech recognition engine, as well as other information.
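One way to register handlers for these events is sketched below. The Wave file path and the color grammar are hypothetical, and the program waits on a ManualResetEvent (a design choice of this sketch, not something the SDK requires) so that it does not exit before the asynchronous operation completes.

```csharp
using System;
using System.Globalization;
using System.Threading;
using Microsoft.Speech.Recognition;

class HandleEvents
{
    static void Main()
    {
        using (var done = new ManualResetEvent(false))
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Hypothetical grammar and input file.
            var builder = new GrammarBuilder(new Choices("red", "green", "blue"));
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));
            recognizer.SetInputToWaveFile(@"C:\temp\colors.wav");

            // Preliminary result while the engine is still processing.
            recognizer.SpeechHypothesized += (s, e) =>
                Console.WriteLine("Hypothesis: " + e.Result.Text);

            // Final result that met the confidence requirements.
            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine("Recognized: {0} (confidence {1:F2})",
                    e.Result.Text, e.Result.Confidence);

            // Input that the engine could not match with enough confidence.
            recognizer.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("Rejected; best guess: " + e.Result.Text);

            // End of the asynchronous recognition operation.
            recognizer.RecognizeCompleted += (s, e) =>
            {
                if (e.Error != null)
                {
                    Console.WriteLine("Error: " + e.Error.Message);
                }
                done.Set();
            };

            recognizer.RecognizeAsync(RecognizeMode.Single);
            done.WaitOne();
        }
    }
}
```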

Load and Unload Grammars

Load a Grammar object using the LoadGrammar(Grammar) or LoadGrammarAsync(Grammar) methods. You can unload a specific Grammar object using the UnloadGrammar(Grammar) method, or unload all currently loaded Grammar objects with a call to UnloadAllGrammars(). If the SpeechRecognitionEngine is running, you can use one of the RequestRecognizerUpdate methods to pause it before loading or unloading Grammar objects.
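A short sketch of loading and unloading grammars on an idle engine follows. The BuildGrammar helper, the grammar names, and the phrases are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class ManageGrammars
{
    // Hypothetical helper: builds a simple list-of-phrases grammar.
    static Grammar BuildGrammar(string name, CultureInfo culture, params string[] phrases)
    {
        var builder = new GrammarBuilder(new Choices(phrases));
        builder.Culture = culture; // must match the recognizer's culture
        var grammar = new Grammar(builder);
        grammar.Name = name;
        return grammar;
    }

    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            CultureInfo culture = recognizer.RecognizerInfo.Culture;
            Grammar colors = BuildGrammar("colors", culture, "red", "green", "blue");
            Grammar sizes = BuildGrammar("sizes", culture, "small", "medium", "large");

            // Load both grammars synchronously.
            recognizer.LoadGrammar(colors);
            recognizer.LoadGrammar(sizes);
            Console.WriteLine("Loaded grammars: " + recognizer.Grammars.Count);

            // Unload one specific grammar, then everything that remains.
            recognizer.UnloadGrammar(colors);
            Console.WriteLine("After UnloadGrammar: " + recognizer.Grammars.Count);

            recognizer.UnloadAllGrammars();
            Console.WriteLine("After UnloadAllGrammars: " + recognizer.Grammars.Count);
        }
    }
}
```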

Start, Pause, and Stop Recognition

To start a recognition operation, use one of the Recognize or RecognizeAsync methods.

To stop an asynchronous recognition operation, use the RecognizeAsyncCancel() or RecognizeAsyncStop() methods.

To pause a running SpeechRecognitionEngine instance so that you can update its configuration or load and unload grammars, use one of the RequestRecognizerUpdate methods.
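The start, pause, and stop sequence could be sketched as follows. This example assumes a default audio input device (such as a microphone) is available; the two grammars are hypothetical. The second grammar is loaded inside the RecognizerUpdateReached handler, which is raised once the engine has paused in response to RequestRecognizerUpdate.

```csharp
using System;
using System.Globalization;
using System.Threading;
using Microsoft.Speech.Recognition;

class ControlRecognition
{
    static void Main()
    {
        using (var completed = new ManualResetEvent(false))
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            CultureInfo culture = recognizer.RecognizerInfo.Culture;

            // Hypothetical grammars for the illustration.
            var colors = new GrammarBuilder(new Choices("red", "green", "blue"));
            colors.Culture = culture;
            var sizes = new GrammarBuilder(new Choices("small", "medium", "large"));
            sizes.Culture = culture;

            recognizer.LoadGrammar(new Grammar(colors));
            recognizer.SetInputToDefaultAudioDevice();

            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            // The engine is paused here, so it is safe to change grammars.
            recognizer.RecognizerUpdateReached += (s, e) =>
                recognizer.LoadGrammar(new Grammar(sizes));

            recognizer.RecognizeCompleted += (s, e) => completed.Set();

            // Start continuous asynchronous recognition.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            // Ask the running engine to pause for an update.
            recognizer.RequestRecognizerUpdate();

            Console.WriteLine("Speak a color or a size; press Enter to stop.");
            Console.ReadLine();

            // Stop after the current recognition operation finishes.
            // RecognizeAsyncCancel() would instead stop without waiting for it.
            recognizer.RecognizeAsyncStop();
            completed.WaitOne();
        }
    }
}
```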

The SpeechRecognitionEngine can perform an additional mode of recognition (called emulation) during which it accepts text, rather than speech, as input. Emulated recognition can be useful for debugging grammars. The speech recognizer raises the SpeechDetected, SpeechHypothesized, SpeechRecognitionRejected, and SpeechRecognized events as if the recognition operation were not emulated. To initiate emulated recognition, call one of the EmulateRecognize or EmulateRecognizeAsync methods and pass in text or an array of words for which you want to perform emulated recognition.
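As a rough illustration of using emulation to check a grammar, the sketch below passes text to EmulateRecognize(String) and inspects the returned result. The color grammar and the test phrases are hypothetical, and the sketch assumes that no audio input needs to be configured because the input is text.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class EmulateInput
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Hypothetical grammar under test.
            var builder = new GrammarBuilder(new Choices("red", "green", "blue"));
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));

            // The same events fire as for spoken input.
            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine("SpeechRecognized event: " + e.Result.Text);
            recognizer.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("SpeechRecognitionRejected event raised");

            // Try one phrase that is in the grammar and one that is not.
            foreach (string phrase in new[] { "green", "purple" })
            {
                RecognitionResult result = recognizer.EmulateRecognize(phrase);

                // EmulateRecognize returns null when the operation does not succeed.
                Console.WriteLine(result != null
                    ? "Matched: " + result.Text
                    : "No match for: " + phrase);
            }
        }
    }
}
```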