Text synthesis (Microsoft Speech Platform)

Microsoft Speech Platform SDK 11

Text synthesis

SAPI 5 uses the Extensible Markup Language (XML) to define text synthesis characteristics and application configuration settings.

A text-to-speech (TTS) engine that uses synthesis generates sounds similar to those of the human voice, then applies filters that simulate throat length, mouth cavity, lip shape, and tongue position. A synthesized voice often sounds less human than one produced by diphone concatenation, but its qualities can be varied by modifying the TTS configuration settings. SAPI 5-compliant TTS engines can improve the quality of a synthesized voice by using XML to control the configuration settings for text synthesis.
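As a rough illustration of XML controlling synthesis settings, the sketch below builds a marked-up string in Python. It assumes the `volume`, `rate`, and `pitch` elements (with `level`, `absspeed`, and `absmiddle` attributes) of the SAPI 5 XML TTS markup; the helper name `build_sapi_markup` and its defaults are hypothetical, and the exact tags an engine accepts should be checked against the Synthesis markup topic.

```python
import xml.etree.ElementTree as ET


def build_sapi_markup(text, rate=0, pitch=0, volume=100):
    """Wrap text in nested volume/rate/pitch elements.

    Hypothetical helper: element and attribute names are assumed from
    the SAPI 5 XML TTS markup and are not verified against an engine.
    """
    # Outermost element sets the volume level.
    volume_el = ET.Element("volume", {"level": str(volume)})
    # Speaking rate, nested inside the volume element.
    rate_el = ET.SubElement(volume_el, "rate", {"absspeed": str(rate)})
    # Pitch, nested inside the rate element; it carries the spoken text.
    pitch_el = ET.SubElement(rate_el, "pitch", {"absmiddle": str(pitch)})
    pitch_el.text = text
    # Serialize to a str; ElementTree escapes characters such as "&".
    return ET.tostring(volume_el, encoding="unicode")


if __name__ == "__main__":
    print(build_sapi_markup("Hello, world.", rate=-2, pitch=3, volume=80))
```

The resulting string would be passed to the engine's speak call with XML parsing enabled. Building the markup through an XML library rather than string concatenation keeps the text properly escaped.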

This section covers the following topics:

Synthesis markup
English Context tag definitions
Chinese Context tag definitions
Japanese Context tag definitions