Speech Synthesis (Microsoft.Speech)

Microsoft Speech Platform SDK 11


Speech synthesis, a technology also known as Text-to-Speech (TTS), converts textual information into synthetic speech output. Using the speech synthesis functionality in Microsoft Speech Platform SDK 11, you can generate speech from applications running on Windows Server.

Create SSML Content

The text that is converted to synthetic speech is usually called a prompt. You can author the content for prompts programmatically, or use the XML format that conforms to the Speech Synthesis Markup Language (SSML) Version 1.0. The PromptBuilder class contains methods for building prompt content from strings, SSML markup, and audio files. Speech synthesis content in XML format must be added to a PromptBuilder object before it can be spoken by the TTS engine.
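As a rough illustration of what prompt content in SSML 1.0 format looks like, the following Python sketch assembles a small SSML document with one sentence element per input string. It uses only the Python standard library and does not call the Speech Platform. The helper name build_ssml and the sample text are hypothetical; the speak and s elements, the version attribute, and the namespace URI come from the W3C SSML 1.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(sentences, lang="en-US"):
    """Return an SSML 1.0 document as a string (hypothetical helper).

    Each input string becomes one <s> (sentence) element under <speak>.
    """
    # Serialize SSML elements in the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: lang})
    for text in sentences:
        sentence = ET.SubElement(speak, f"{{{SSML_NS}}}s")
        sentence.text = text  # ElementTree escapes &, <, > on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml(["Welcome to the speech platform.", "Tom & Jerry will assist you."]))
```

Markup produced this way is ordinary SSML text. It would still need to be added to a PromptBuilder object, as described above, before the TTS engine can speak it; that step is outside the scope of this sketch.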

Control Speech Output

You can control the characteristics of synthesized speech. For example, you can select a speaking voice, change the current speaking voice, and specify voice characteristics such as language, age, and gender. You can also control attributes of speech such as volume, speaking rate, and emphasis.
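In SSML 1.0, the controls described above correspond to the voice, prosody, and emphasis elements. The Python sketch below shows one way such markup might be generated with the standard library. The function name styled_ssml, its parameters, and the sample text are hypothetical, and whether a particular installed voice honors every attribute is an assumption to verify against that engine.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(tag):
    """Qualify a tag name with the SSML namespace."""
    return f"{{{SSML_NS}}}{tag}"


def styled_ssml(text, stressed, lang="en-US", gender="female",
                rate="slow", volume="loud"):
    """Return SSML that requests a voice, prosody, and emphasis (hypothetical helper).

    `text` is spoken with the requested rate and volume; `stressed` is
    appended after it inside an <emphasis level="strong"> element.
    """
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(q("speak"), {"version": "1.0", XML_LANG: lang})
    # <voice> requests voice characteristics such as gender.
    voice = ET.SubElement(speak, q("voice"), {"gender": gender})
    # <prosody> controls speaking rate and volume.
    prosody = ET.SubElement(voice, q("prosody"), {"rate": rate, "volume": volume})
    prosody.text = text + " "
    # <emphasis> marks the words to stress.
    emphasis = ET.SubElement(prosody, q("emphasis"), {"level": "strong"})
    emphasis.text = stressed
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(styled_ssml("Your flight departs at", "nine o'clock"))
```

The attribute values used here (gender of female, rate of slow, volume of loud, emphasis level of strong) are among those enumerated in the SSML 1.0 specification. The PromptBuilder class offers programmatic ways to express the same controls, which this sketch does not attempt to reproduce.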

Note

A speech synthesis engine can render speech in only one language; a different engine is required for each language. The Microsoft Speech Platform Runtime 11 and Speech Platform SDK 11 do not include any engines for speech synthesis in a specific language. You must download a language pack (an engine for speech synthesis in a specific language) for each language in which you want to generate synthesized speech. See InstalledVoice for more information.

In This Section