SpeechRecoContext SoundStart Event

Microsoft Speech SDK

Speech Automation 5.1

Interface: ISpeechRecoContext Events

SoundStart Event


The SoundStart event occurs when the SR engine encounters the start of sound in the audio input stream.

SoundStart indicates a sound level significant enough to be a voice. When that sound stops, a SoundEnd event is generated. A recognition attempt occurs only after a SoundEnd event; hence, long continuous speaking periods may take an equally long time to process.

Light background noise will not register as an input sound. Conversely, a loud noise will be treated as the start of an input sound. If the sound remains constant, a time-out occurs and a SoundEnd event is sent.


SpeechRecoContext.SoundStart(
     StreamNumber As Long,
     StreamPosition As Variant
)

Parameters

StreamNumber
Specifies the stream number.
StreamPosition
Specifies the position within the stream. If the audio stream is being downsampled, StreamPosition is the byte position within the converted stream.
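
As an illustration of how the two parameters might be used together, the following Visual Basic sketch stores the position reported by SoundStart and, on SoundEnd, shows how far the stream advanced while the sound lasted. It assumes a form with two labels called Label1 and Label2, a project reference to the Microsoft Speech Object Library, and that the Variant in StreamPosition holds a numeric value that supports subtraction. The lastSoundStart variable is a name introduced here for illustration only.

```vb
Option Explicit

Public WithEvents RC As SpSharedRecoContext
Public myGrammar As ISpeechRecoGrammar

' Hypothetical helper variable: position reported by the most recent SoundStart.
Private lastSoundStart As Variant

Private Sub Form_Load()
    Set RC = New SpSharedRecoContext

    ' Activate dictation so the engine starts listening.
    Set myGrammar = RC.CreateGrammar
    myGrammar.DictationSetState SGDSActive
End Sub

Private Sub RC_SoundStart(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    ' Remember where the sound began.
    lastSoundStart = StreamPosition
    Label1.Caption = "Stream " & StreamNumber & ": sound start at " & StreamPosition
End Sub

Private Sub RC_SoundEnd(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    ' Assumption: both positions are numeric Variants, so they can be subtracted.
    If Not IsEmpty(lastSoundStart) Then
        Label2.Caption = "Sound spanned " & (StreamPosition - lastSoundStart) & " bytes"
    End If
End Sub
```

Keeping the start position in a module-level variable is only one possible design; an application handling several streams would probably need to track positions per StreamNumber.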

Remarks

For speech processing, the SR engine must perform the following sequence: stream start, sound start, and phrase start. A stream start indicates that a valid stream is ready for audio input. The stream persists unless the recognition context is disabled or the associated grammar is deactivated. The sound start indicates that a sound level has been detected. However, the SR engine may stop that recognition attempt if the input sound is questionable, for example if the sound stays at a constant level, or if it is above or below predetermined sound levels. If the sound level is acceptable and variable, a phrase start is initiated and is assumed to be the beginning of a recognition attempt.
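
The sequence described above can be observed with a small sketch that logs each event as it arrives. It assumes a form with a list box called List1 (a control name chosen here for illustration), a project reference to the Microsoft Speech Object Library, and that the recognition context's event interests include the StartStream, SoundStart, PhraseStart and SoundEnd events. LogEvent is a hypothetical helper, not part of the Speech SDK.

```vb
Option Explicit

Public WithEvents RC As SpSharedRecoContext
Public myGrammar As ISpeechRecoGrammar

Private Sub Form_Load()
    Set RC = New SpSharedRecoContext

    ' Activate dictation so the engine starts processing audio.
    Set myGrammar = RC.CreateGrammar
    myGrammar.DictationSetState SGDSActive
End Sub

' Hypothetical helper: appends one line per event to the list box.
Private Sub LogEvent(ByVal EventName As String, ByVal StreamNumber As Long)
    List1.AddItem EventName & " (stream " & StreamNumber & ")"
End Sub

Private Sub RC_StartStream(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    LogEvent "StartStream", StreamNumber
End Sub

Private Sub RC_SoundStart(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    LogEvent "SoundStart", StreamNumber
End Sub

Private Sub RC_PhraseStart(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    LogEvent "PhraseStart", StreamNumber
End Sub

Private Sub RC_SoundEnd(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
    LogEvent "SoundEnd", StreamNumber
End Sub
```

If the engine abandons a questionable sound, the log would be expected to show a SoundStart followed by a SoundEnd with no PhraseStart between them.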

Example

The following Visual Basic form code demonstrates the use of the SoundStart and SoundEnd events. The application displays notifications that a sound has begun or ended, including the stream position at which the sound ended. It also displays the text of a successful recognition.

To run this code, create a form with the following controls:

  • Two labels called Label1 and Label2

Paste this code into the Declarations section of the form.

The Form_Load procedure creates and activates a dictation grammar.

    Public WithEvents RC As SpSharedRecoContext
    Public myGrammar As ISpeechRecoGrammar
    
    Private Sub Form_Load()
        Set RC = New SpSharedRecoContext
    	
        Set myGrammar = RC.CreateGrammar
        myGrammar.DictationSetState SGDSActive
    End Sub
    
    ' Show the recognized text after a successful recognition.
    Private Sub RC_Recognition(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal RecognitionType As SpeechLib.SpeechRecognitionType, ByVal Result As SpeechLib.ISpeechRecoResult)
        Label1.Caption = Result.PhraseInfo.GetText
    End Sub
    
    ' Report the stream position at which the sound stopped.
    Private Sub RC_SoundEnd(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
        Label2.Caption = "Sound end at position: " & StreamPosition
    End Sub
    
    ' Report that the engine has detected the start of a sound.
    Private Sub RC_SoundStart(ByVal StreamNumber As Long, ByVal StreamPosition As Variant)
        Label2.Caption = "Sound start"
    End Sub