Text synthesis

Microsoft Speech SDK SAPI 5.1

SAPI 5 uses the Extensible Markup Language (XML) to define text synthesis characteristics and application configuration settings.

A text-to-speech (TTS) engine that uses synthesis generates sounds similar to those of the human voice and applies filters to simulate throat length, mouth cavity, lip shape, and tongue position. Although a synthesized voice often sounds less human than one produced by diphone concatenation, different voice qualities can be obtained by modifying the TTS configuration settings. With SAPI 5-compliant TTS engines, these settings are controlled through XML, which can improve the quality of the synthesized voice.
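As an illustration of controlling synthesis settings through XML, the following minimal Python sketch builds a marked-up string using the SAPI 5 volume, rate, and pitch tags. The helper name build_sapi_markup and its default values are hypothetical and are not part of the SDK. The value ranges it checks (0 to 100 for volume, -10 to 10 for rate and pitch) are assumptions that should be confirmed against the SAPI XML TTS reference. The sketch only produces the markup; it does not call a speech engine.

```python
import xml.etree.ElementTree as ET


def build_sapi_markup(text, volume=80, rate=0, pitch=0):
    """Wrap text in nested SAPI 5 TTS XML tags (hypothetical helper).

    Assumed ranges: volume 0..100, rate and pitch -10..10.
    """
    if not 0 <= volume <= 100:
        raise ValueError("volume must be between 0 and 100")
    if not -10 <= rate <= 10:
        raise ValueError("rate must be between -10 and 10")
    if not -10 <= pitch <= 10:
        raise ValueError("pitch must be between -10 and 10")

    # Nest the tags so that all three settings apply to the same text.
    volume_el = ET.Element("volume", level=str(volume))
    rate_el = ET.SubElement(volume_el, "rate", absspeed=str(rate))
    pitch_el = ET.SubElement(rate_el, "pitch", absmiddle=str(pitch))
    pitch_el.text = text  # ElementTree escapes &, <, > on serialization

    return ET.tostring(volume_el, encoding="unicode")


if __name__ == "__main__":
    print(build_sapi_markup("Hello, world.", volume=80, rate=-2, pitch=3))
```

The resulting string would then be passed to the engine's speak call with XML parsing enabled. Building the markup with an XML library instead of string concatenation keeps the tags well-formed and escapes any special characters in the spoken text.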

The following section covers: