Using Sample Audio Object (SpAudioPlug)

Microsoft Speech SDK

Overview

This paper presents an overview to assist in writing a custom audio object. It is intended to be used with the SAPI 5.1 SDK sample Visual Basic Audio Application. That application lets a user enter text in an edit box and then performs dictation speech recognition on the spoken form of that text. A custom audio object takes the place of the usual speaker output. Rather than playing the synthesized voice over the speakers, the application redirects it through the custom audio object to the speech recognition (SR) engine, which then attempts to recognize it.

Implementation

The Visual Basic example VB Audio Application uses the automation interface provided by the sample audio object to manage the audio data. The application creates two instances of the sample audio object: one for text-to-speech (TTS) output and one for SR input. The application then routes the audio data from the TTS output to the SR input.

Additional custom audio processing is demonstrated in the SAPI 5.1 SDK samples VB SAPI with Internet and VB Outgoing Call. The white paper Speech Telephony Application Guide covers the implementation details of VB SAPI with Internet.

Set up the TTS output

For TTS, we create an instance of the sample audio object and set it to write mode by passing True as the first argument to Init.

Set AudioPlugOut = New SpAudioPlug
AudioPlugOut.Init True, AUDIOFORMAT

Then the Voice's output is set to point to this audio object.

Set Voice = New SpVoice
Set Voice.AudioOutputStream = AudioPlugOut

Set up the SR input

For SR, we create an instance of the sample audio object and set it to read mode by passing False as the first argument to Init.

Set AudioPlugIn = New SpAudioPlug
AudioPlugIn.Init False, AUDIOFORMAT

Then the Recognizer's input is set to point to this audio object.

Set Recognizer.AudioInputStream = AudioPlugIn
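
The assignment above presumes that a Recognizer object already exists and that something is listening for results. The following is a minimal sketch of how that might look with the standard SAPI 5.1 automation objects. The names RecoContext, Grammar, and SetUpRecognizer, the grammar ID of 1, and the choice of an in-process recognizer are assumptions made for illustration and may differ from what the sample application does.

```vb
' Sketch only: declarations belong at module level of a form or class.
' Names other than the SAPI automation types are assumptions.
Dim Recognizer As SpInprocRecognizer
Dim WithEvents RecoContext As SpInProcRecoContext
Dim Grammar As ISpeechRecoGrammar

Private Sub SetUpRecognizer()
    ' Create the recognizer and attach the custom audio input
    ' (same assignment as shown above; AudioPlugIn must already be initialized)
    Set Recognizer = New SpInprocRecognizer
    Set Recognizer.AudioInputStream = AudioPlugIn

    ' Create a recognition context, then load and activate dictation
    Set RecoContext = Recognizer.CreateRecoContext
    Set Grammar = RecoContext.CreateGrammar(1)
    Grammar.DictationLoad
    Grammar.DictationSetState SGDSActive
End Sub

' Raised by SAPI when a phrase has been recognized
Private Sub RecoContext_Recognition(ByVal StreamNumber As Long, _
                                    ByVal StreamPosition As Variant, _
                                    ByVal RecognitionType As SpeechRecognitionType, _
                                    ByVal Result As ISpeechRecoResult)
    ' Print the recognized text to the Immediate window
    Debug.Print Result.PhraseInfo.GetText
End Sub
```

With this in place, any audio written to AudioPlugIn is what the dictation grammar would be asked to recognize.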

Start processing

The following code, run repeatedly while the TTS and SR processes are active, routes audio data from the TTS output to the SR input:

'Retrieve the audio data currently held by the TTS output audio object
output = AudioPlugOut.GetData
'Output the audio data to the input audio object
If (Len(output) * 2 <> 0) Then
   AudioPlugIn.SetData (output)
End If
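
The routing code above moves only the data available at the moment it runs, so it has to be placed inside a loop that keeps going while the voice is speaking. The following is a minimal sketch of such a loop, assuming that Voice, AudioPlugOut, and AudioPlugIn were set up as shown earlier. The procedure name SpeakAndRoute and the text box TxtSpeak are hypothetical, and the polling strategy is one plausible approach rather than a description of the sample's actual code.

```vb
' Sketch only: SpeakAndRoute and TxtSpeak are hypothetical names.
Private Sub SpeakAndRoute()
    Dim output As Variant

    ' Speak asynchronously so this procedure keeps running
    ' while the voice writes audio into AudioPlugOut
    Voice.Speak TxtSpeak.Text, SVSFlagsAsync

    ' Poll until the voice reports that it has finished speaking
    Do While Voice.Status.RunningState <> SRSEDone
        output = AudioPlugOut.GetData
        If (Len(output) * 2 <> 0) Then
            AudioPlugIn.SetData (output)
        End If
        DoEvents    ' let the UI and SAPI events be processed
    Loop

    ' Final pass in case data arrived after the last poll
    output = AudioPlugOut.GetData
    If (Len(output) * 2 <> 0) Then
        AudioPlugIn.SetData (output)
    End If
End Sub
```

Speaking with SVSFlagsAsync matters here: a synchronous Speak call would not return until the voice had finished, leaving no opportunity to move data to the recognizer in the meantime.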