SpVoice Phoneme event (Microsoft Speech Platform)

Microsoft Speech Platform SDK 11

Object: SpVoice (Events)

Phoneme Event

The Phoneme event occurs when the text-to-speech (TTS) engine detects a phoneme boundary while speaking a stream for the SpVoice object.


SpVoice.Phoneme(
     StreamNumber As Long,
     StreamPosition As Variant,
     Duration As Long,
     NextPhoneId As Integer,
     Feature As SpeechVisemeFeature,
     CurrentPhoneId As Integer
)

Parameters

StreamNumber
The number of the stream that generated the event. When a voice enqueues more than one stream by speaking asynchronously, the stream number is needed to associate each event with the appropriate stream.
StreamPosition
The character position in the output stream at which the phoneme begins.
Duration
The duration of the phoneme, in milliseconds.
NextPhoneId
The ID of the phone that follows the current phone.
Feature
The SpeechVisemeFeature, which may indicate emphasis or stress on the viseme.
CurrentPhoneId
The current phone ID.
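
As a minimal sketch of how StreamNumber can be used, the following form code speaks two streams asynchronously and labels each Phoneme event by the stream that raised it. It relies on the Speak method returning the stream number of the enqueued stream. The variables greetingStream and detailStream and the helper StreamLabel are hypothetical names introduced for illustration; the form is assumed to contain a command button called Command1 and a text box called Text2.

```vb
Option Explicit

Public WithEvents vox As SpeechLib.SpVoice

' Hypothetical variables that remember which stream is which.
Private greetingStream As Long
Private detailStream As Long

Private Sub Form_Load()
    Set vox = New SpVoice
End Sub

Private Sub Command1_Click()
    ' Keep the stream number that Speak returns for each enqueued stream.
    greetingStream = vox.Speak("Hello.", SVSFlagsAsync)
    detailStream = vox.Speak("This is the second stream.", SVSFlagsAsync)
End Sub

' Hypothetical helper: maps a stream number to a display label.
Private Function StreamLabel(ByVal StreamNumber As Long) As String
    Select Case StreamNumber
        Case greetingStream: StreamLabel = "greeting"
        Case detailStream: StreamLabel = "detail"
        Case Else: StreamLabel = "stream " & StreamNumber
    End Select
End Function

Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)
    ' Prefix each phone ID with the label of the stream that raised the event.
    Text2.Text = Text2.Text & StreamLabel(StreamNumber) & ":" & CurrentPhoneId & " "
End Sub
```

Comparing against stored stream numbers keeps the handler independent of the order in which the streams are spoken.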

Remarks

Depending on the language, a TTS engine may return a SAPI ID or a Universal Phone Set (UPS) ID when it encounters a phoneme. To determine which phone set (phonetic alphabet) a synthesis engine is using to create pronunciations, call the ISpPhoneticAlphabetSelection::IsAlphabetUPS method.

When the engine synthesizes a phoneme composed of more than one phoneme element, it raises an event for each element. For example, when a Japanese TTS engine speaks the phoneme "KYA," which consists of the phoneme elements "KI" and "XYA," it raises an SPEI_PHONEME event for each element. Because the element "KI" in this case modifies the sound of the element that follows it, rather than initiating a sound of its own, the duration of its SPEI_PHONEME event is zero.
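
The behavior described above suggests that a handler may want to treat zero-duration events differently from events that carry audible duration. The following is a sketch only: DescribePhoneme is a hypothetical helper, not part of the SpVoice API, and vox is assumed to be declared with WithEvents as in the form shown in the Example section. A zero duration is treated here as a hint that the element modifies the element following it, and engines may report zero duration in other situations as well.

```vb
' Hypothetical helper: formats one Phoneme event for display.
Private Function DescribePhoneme(ByVal CurrentPhoneId As Integer, ByVal Duration As Long) As String
    If Duration = 0 Then
        ' Assumption: a zero duration marks an element that modifies
        ' the following element, as in the "KI" + "XYA" case.
        DescribePhoneme = CurrentPhoneId & " (modifier element, 0 ms)"
    Else
        DescribePhoneme = CurrentPhoneId & " (" & Duration & " ms)"
    End If
End Function

Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)
    ' Write one line per phoneme element to the Immediate window.
    Debug.Print DescribePhoneme(CurrentPhoneId, Duration)
End Sub
```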


Example

The following Visual Basic form code demonstrates the Phoneme event. To run this code, create a form with the following controls:

  • A command button called Command1
  • Two text boxes called Text1 and Text2

Paste this code into the Declarations section of the form.

The Form_Load procedure puts a text string in Text1 and creates a voice object, leaving all of its properties at their default settings. The Command1_Click procedure calls the Speak method. As the voice speaks, the TTS engine raises a Phoneme event for each phoneme, and the Phoneme event handler appends each phoneme ID to Text2.


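Option Explicit

' WithEvents lets the form receive the events raised by the voice.
Public WithEvents vox As SpeechLib.SpVoice

Private Sub Command1_Click()

    ' Speak asynchronously so that the call returns immediately and
    ' Phoneme events arrive while the text is being spoken.
    vox.Speak Text1.Text, SVSFlagsAsync

End Sub

Private Sub Form_Load()

    ' Create the voice with its default settings.
    Set vox = New SpVoice
    Text1.Text = "This is text in a text box."

End Sub

Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)

    ' Append the ID of each reported phoneme to Text2.
    Text2.Text = Text2.Text & CurrentPhoneId & " "

End Sub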