TTS Engine Characteristics
Engines use three characteristics, Volume, Pitch, and Rate, to partially define speech traits. At the application level, setting these values is simple: you need only set each one to a number. Implementing these traits in the engine, however, is more complex.
Volume
At the application level, volume is a number from zero to 100, where 100 is the maximum volume for a voice. The scale is linear, so a value of 50 represents half of the loudest volume permitted. Each increment should be one-hundredth of the range.
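The linear mapping described above can be sketched as follows. This is a minimal illustration, not an engine implementation: the function name volume_to_gain is hypothetical, the result is expressed as a fraction of the voice's maximum, and clamping out-of-range input is an assumption the text does not state.

```python
def volume_to_gain(volume: int) -> float:
    """Map an application-level volume (0..100) to a fraction of the maximum."""
    # Assumption: out-of-range values are clamped rather than rejected.
    clamped = max(0, min(100, volume))
    # Linear progression: each step is one-hundredth of the range.
    return clamped / 100.0


print(volume_to_gain(50))   # 0.5, half of the loudest permitted
print(volume_to_gain(100))  # 1.0, the maximum for the voice
```

How an engine turns this fraction into signal amplitude or perceived loudness is left open here.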
Pitch adjustment
The value can range from -10 to +10. A value of zero sets a voice to speak at its default pitch. A value of -10 sets a voice to speak at about three-fourths of its default pitch. A value of +10 sets a voice to speak at about four-thirds of its default pitch. The increments between -10 and +10 are logarithmically distributed, so that incrementing or decrementing by 1 multiplies or divides the pitch by the 24th root of 2 (about 1.03). Values outside the -10 to +10 range will be passed to an engine. However, SAPI 5-compliant engines may not support such extremes and may clip the pitch to the maximum or minimum the engine supports. Values of -24 and +24 must lower and raise the pitch by one octave, respectively. Every increment or decrement of 1 must multiply or divide the pitch by the 24th root of 2.
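As a rough sketch of the pitch arithmetic above, the adjustment can be turned into a multiplier on the default pitch. The function name pitch_multiplier and the lo/hi clipping limits are hypothetical; a real engine would substitute whatever limits it actually supports.

```python
PITCH_STEP = 2 ** (1 / 24)  # 24th root of 2, about 1.03


def pitch_multiplier(adjust: int, lo: int = -24, hi: int = 24) -> float:
    """Return the factor applied to the default pitch for a given adjustment."""
    # Assumption: the engine clips to its own supported range [lo, hi].
    clipped = max(lo, min(hi, adjust))
    # Each step multiplies (or divides) the pitch by the 24th root of 2.
    return 2 ** (clipped / 24)


print(round(pitch_multiplier(10), 3))   # 1.335, roughly four-thirds
print(round(pitch_multiplier(-10), 3))  # 0.749, roughly three-fourths
print(pitch_multiplier(24))             # 2.0, one octave up
```

The printed values show that four-thirds and three-fourths are close approximations of the exact multipliers at +10 and -10.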
Rate adjustment
The value can range from -10 to +10. A value of zero sets a voice to speak at its default rate. A value of -10 sets a voice to speak at one-third of its default rate. A value of +10 sets a voice to speak at three times its default rate. The increments between -10 and +10 are logarithmically distributed, so that incrementing or decrementing by 1 multiplies or divides the rate by the 10th root of 3 (about 1.1). Values outside the -10 to +10 range will be passed to an engine. However, SAPI 5-compliant engines may not support such extremes and may clip the rate to the maximum or minimum rate the engine supports.
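The rate arithmetic follows the same pattern and might be sketched like this. The function name rate_multiplier and the default clipping limits of -10 and +10 are illustrative assumptions; an engine that supports more extreme rates would use wider limits.

```python
RATE_STEP = 3 ** (1 / 10)  # 10th root of 3, about 1.1


def rate_multiplier(adjust: int, lo: int = -10, hi: int = 10) -> float:
    """Return the factor applied to the default speaking rate."""
    # Assumption: the engine clips to its own supported range [lo, hi].
    clipped = max(lo, min(hi, adjust))
    # Each step multiplies (or divides) the rate by the 10th root of 3.
    return 3 ** (clipped / 10)


print(rate_multiplier(10))            # 3.0, three times the default rate
print(round(rate_multiplier(5), 3))   # 1.732, the square root of 3
```

Because the scale is logarithmic, an adjustment of +5 gives the square root of 3 rather than a multiplier halfway between 1 and 3.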