Simple TTS Guide – Speak to a File and Speak a File

Microsoft Speech SDK, Speech Automation 5.1

Overview

This document is intended to help developers of text-to-speech (TTS) applications use SAPI TTS functionality to speak text into a wav file and to speak a text file. The examples illustrate how to use the Speak and SpeakStream methods, how to select a specific voice, and how to set the output audio stream to a wav file. Each example is given in both C++ and Visual Basic.

Speak to a wav file in C++

The following example speaks a text string, “Hello World”, to a wav file, “ttstemp.wav”, in C++/ATL COM. The SAPI helper class CSpStreamFormat and the helper function SPBindToFile, both defined in sphelper.h, are used to set the wav format and to bind the audio stream to the file. SPSF_22kHz16BitMono is selected as the output format for better audio quality, because it is the preferred wav format of the Microsoft English TTS engine. ISpVoice::SetOutput() must be called to direct the audio to the file stream because, by default, output goes to the default audio device. For simplicity, ISpVoice::Speak() is called synchronously. To speak asynchronously, change the speak flag to SPF_ASYNC and call ISpVoice::WaitUntilDone() after ISpVoice::Speak() to wait for the speak process to complete.


	HRESULT				hr = S_OK;
	CComPtr <ISpVoice>		cpVoice;
	CComPtr <ISpStream>		cpStream;
	CSpStreamFormat			cAudioFmt;

	//Create a SAPI Voice
	hr = cpVoice.CoCreateInstance( CLSID_SpVoice );

	//Set the audio format
	if(SUCCEEDED(hr))
	{
		hr = cAudioFmt.AssignFormat(SPSF_22kHz16BitMono);
	}

	//Call SPBindToFile, a SAPI helper function, to bind the audio stream to the file
	if(SUCCEEDED(hr))
	{
		hr = SPBindToFile( L"c:\\ttstemp.wav", SPFM_CREATE_ALWAYS,
			&cpStream, &cAudioFmt.FormatId(), cAudioFmt.WaveFormatExPtr() );
	}

	//Set the output to cpStream so that the output audio data will be stored in cpStream
	if(SUCCEEDED(hr))
	{
		hr = cpVoice->SetOutput( cpStream, TRUE );
	}

	//Speak the text "Hello World" synchronously
	if(SUCCEEDED(hr))
	{
		hr = cpVoice->Speak( L"Hello World", SPF_DEFAULT, NULL );
	}

	//Close the stream
	if(SUCCEEDED(hr))
	{
		hr = cpStream->Close();
	}

	//Release the stream and voice object
	cpStream.Release();
	cpVoice.Release();
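
The asynchronous variant described above can be sketched as follows. This is only an illustration under stated assumptions: the function name SpeakToWavAsync and its parameters are hypothetical, COM is assumed to be initialized on the calling thread already, and the code is wrapped in a _WIN32 guard because SAPI is available only on Windows. The only changes from the synchronous example are the SPF_ASYNC flag and the call to ISpVoice::WaitUntilDone().

```cpp
#ifdef _WIN32
#include <atlbase.h>
#include <sapi.h>
#include <sphelper.h>

// Hypothetical helper (not part of SAPI): speaks `text` into the wav file
// at `wavPath` asynchronously, then waits for the speak process to finish.
// Assumes COM has already been initialized on the calling thread.
HRESULT SpeakToWavAsync(const wchar_t* wavPath, const wchar_t* text)
{
	CComPtr<ISpVoice>	cpVoice;
	CComPtr<ISpStream>	cpStream;
	CSpStreamFormat		cAudioFmt;

	HRESULT hr = cpVoice.CoCreateInstance(CLSID_SpVoice);

	if (SUCCEEDED(hr))
	{
		hr = cAudioFmt.AssignFormat(SPSF_22kHz16BitMono);
	}

	// Bind the output stream to the wav file, as in the synchronous example
	if (SUCCEEDED(hr))
	{
		hr = SPBindToFile(wavPath, SPFM_CREATE_ALWAYS, &cpStream,
			&cAudioFmt.FormatId(), cAudioFmt.WaveFormatExPtr());
	}

	if (SUCCEEDED(hr))
	{
		hr = cpVoice->SetOutput(cpStream, TRUE);
	}

	// SPF_ASYNC queues the request; Speak returns without waiting for it
	if (SUCCEEDED(hr))
	{
		hr = cpVoice->Speak(text, SPF_ASYNC, NULL);
	}

	// Other work could be done here before blocking.
	// Wait without a timeout until the queued speak request has completed.
	if (SUCCEEDED(hr))
	{
		hr = cpVoice->WaitUntilDone(INFINITE);
	}

	// Close the stream only after speaking has finished
	if (SUCCEEDED(hr))
	{
		hr = cpStream->Close();
	}

	return hr;
}
#endif
```

Closing the stream before WaitUntilDone() returns would risk truncating the file, which is why the wait comes first in this sketch.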

Speak to a wav file in Automation

The following example, written in Visual Basic, has the same functionality as the C++ example above. When an SpFileStream object is created, it is assigned a default format, SAFT22kHz16BitMono, so the user does not need to assign a wav format explicitly unless a specific format is needed. In this example, ISpeechFileStream.Open creates a wav file, ttstemp.wav, and binds the FileStream to it. The third parameter of ISpeechFileStream.Open is the Boolean DoEvents, which defaults to False. However, the user should always set it to True if SAPI events are to be displayed while the wav file is played back. If the parameter is False, no engine events are stored in the file, so no engine events are fired during playback.


Dim FileName As String
Dim FileStream As New SpFileStream
Dim Voice As SpVoice

'Create a SAPI voice
Set Voice = New SpVoice

'The output audio data will be saved to ttstemp.wav file
FileName = "c:\ttstemp.wav"
	
'Create a file; set DoEvents=True so TTS events will be saved to the file
FileStream.Open FileName, SSFMCreateForWrite, True

'Set the output to the FileStream
Set Voice.AudioOutputStream = FileStream
	
'Speak the text
Voice.Speak "hello world"

'Close the Stream
FileStream.Close

'Release the objects
Set FileStream = Nothing
Set Voice = Nothing
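
Both examples above write 22 kHz, 16-bit, mono audio. To give a rough idea of what that format implies for the size of the output file, the following C++ sketch computes the usual PCM figures. It assumes that the “22kHz” in the format name stands for a sampling rate of 22,050 samples per second; the PcmFormat type and the function names are illustrative and are not part of SAPI.

```cpp
#include <cstdint>

// Illustrative stand-in for the WAVEFORMATEX fields that matter here.
struct PcmFormat
{
	std::uint32_t samplesPerSec;
	std::uint16_t bitsPerSample;
	std::uint16_t channels;
};

// Bytes occupied by one sample across all channels
constexpr std::uint32_t BlockAlign(PcmFormat f)
{
	return f.channels * (f.bitsPerSample / 8u);
}

// Bytes of audio data produced per second of speech
constexpr std::uint32_t BytesPerSecond(PcmFormat f)
{
	return f.samplesPerSec * BlockAlign(f);
}

// Bytes of audio data for a given duration, excluding the wav header
constexpr std::uint64_t DataBytes(PcmFormat f, std::uint32_t seconds)
{
	return static_cast<std::uint64_t>(BytesPerSecond(f)) * seconds;
}

// Assumption: 22 kHz here means 22,050 Hz
constexpr PcmFormat k22kHz16BitMono{22050, 16, 1};
```

With these figures, each second of speech takes 44,100 bytes, so one minute comes to 2,646,000 bytes of sample data, before the wav header and any stored events are counted.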

Speak a Text File in C++

The following code snippet demonstrates how to speak a text file with a specific voice. SpEnumTokens, a SAPI helper function, enumerates the voice tokens available under the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens. It returns a token enumerator containing all tokens that meet a set of required and optional attributes. Tokens in the enumerator are sorted by a “best match” rule, so the first token is the closest match. In the following example, the required voice attribute is “Name=Microsoft Sam” and there are no optional attributes, so SpEnumTokens returns all of the voice tokens with the “Name=Microsoft Sam” attribute. The IEnumSpObjectTokens::Next() method retrieves the best voice token, which is then set as the current voice by calling ISpVoice::SetVoice(). At that point “Microsoft Sam” is the current voice.

Because the voice is speaking text from a file, the speak flag SPF_IS_FILENAME must be set in the ISpVoice::Speak call. Alternatively, you may use ISpVoice::SpeakStream to speak a file. In that case, call the helper function SPBindToFile to bind the text file to an ISpStream object, and then call ISpVoice::SpeakStream.


	HRESULT				hr = S_OK;
	CComPtr <ISpVoice>		cpVoice;
	CComPtr <ISpObjectToken>	cpToken;
	CComPtr <IEnumSpObjectTokens>	cpEnum;

	//Create a SAPI voice
	hr = cpVoice.CoCreateInstance( CLSID_SpVoice );

	//Enumerate voice tokens with attribute "Name=Microsoft Sam"
	if(SUCCEEDED(hr))
	{
		hr = SpEnumTokens(SPCAT_VOICES, L"Name=Microsoft Sam", NULL, &cpEnum);
	}

	//Get the closest token; Next returns S_FALSE when no token is available,
	//so treat that as "voice not found" instead of passing a NULL token on
	if(SUCCEEDED(hr))
	{
		hr = cpEnum->Next(1, &cpToken, NULL);
		if(hr == S_FALSE)
		{
			hr = SPERR_NOT_FOUND;
		}
	}

	//Set the voice
	if(SUCCEEDED(hr))
	{
		hr = cpVoice->SetVoice( cpToken );
	}

	//Set the output to the default audio device
	if(SUCCEEDED(hr))
	{
		hr = cpVoice->SetOutput( NULL, TRUE );
	}

	//Speak the text file (assumed to exist)
	if(SUCCEEDED(hr))
	{
		hr = cpVoice->Speak( L"c:\\ttstemp.txt", SPF_IS_FILENAME, NULL );
	}

	//Release objects
	cpVoice.Release();
	cpEnum.Release();
	cpToken.Release();
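
The ISpVoice::SpeakStream alternative mentioned above can be sketched in the same style. This is a minimal illustration, not a definitive implementation: the function name SpeakTextFileViaStream is hypothetical, the default voice and default audio device are used, COM is assumed to be initialized on the calling thread already, and the code is wrapped in a _WIN32 guard because SAPI is available only on Windows. No format is passed to SPBindToFile, on the assumption that a .txt file is then bound as a text stream.

```cpp
#ifdef _WIN32
#include <atlbase.h>
#include <sapi.h>
#include <sphelper.h>

// Hypothetical helper (not part of SAPI): speaks the text file at `path`
// through ISpVoice::SpeakStream instead of Speak with SPF_IS_FILENAME.
// Assumes COM has already been initialized on the calling thread.
HRESULT SpeakTextFileViaStream(const wchar_t* path)
{
	CComPtr<ISpVoice>	cpVoice;
	CComPtr<ISpStream>	cpStream;

	HRESULT hr = cpVoice.CoCreateInstance(CLSID_SpVoice);

	// Bind the existing text file to a read-only stream
	if (SUCCEEDED(hr))
	{
		hr = SPBindToFile(path, SPFM_OPEN_READONLY, &cpStream);
	}

	// Speak the stream synchronously on the default audio device
	if (SUCCEEDED(hr))
	{
		hr = cpVoice->SpeakStream(cpStream, SPF_DEFAULT, NULL);
	}

	// Close the stream if it was opened, whether or not speaking succeeded
	if (cpStream)
	{
		cpStream->Close();
	}

	return hr;
}
#endif
```

A call such as SpeakTextFileViaStream(L"c:\\ttstemp.txt") would then play the file; a specific voice could be selected first with ISpVoice::SetVoice(), as in the example above.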

Speak a Text File in Automation

The following code illustrates how to speak a text file in a specific voice in Visual Basic. This example assumes a text file (ttstemp.txt) containing the text to be spoken already exists. ISpeechVoice.SpeakStream is used here to speak an SpFileStream that has been bound to the file.


Dim FileName As String
Dim FileStream As New SpFileStream
Dim Voice As SpVoice

'Create SAPI voice
Set Voice = New SpVoice
	
'Assume that ttstemp.txt exists
FileName = "c:\ttstemp.txt"
	
'Open the text file
FileStream.Open FileName, SSFMOpenForRead, True

'Select Microsoft Sam voice
Set Voice.Voice = Voice.GetVoices("Name=Microsoft Sam", "Language=409").Item(0)

'Speak the file stream
Voice.SpeakStream FileStream

'Close the Stream
FileStream.Close

'Release the objects
Set FileStream = Nothing
Set Voice = Nothing