SpVoice Phoneme event (Microsoft Speech Platform)

Microsoft Speech Platform SDK 11



Phoneme Event

The Phoneme event occurs when the text-to-speech (TTS) engine detects a phoneme boundary while speaking a stream for the SpVoice object.

SpVoice.Phoneme(
     StreamNumber As Long,
     StreamPosition As Variant,
     Duration As Long,
     NextPhoneId As Integer,
     Feature As SpeechVisemeFeature,
     CurrentPhoneId As Integer
)

Parameters

StreamNumber: The number of the stream that generated the event. When a voice enqueues more than one stream by speaking asynchronously, the stream number is needed to associate each event with the stream that raised it.
StreamPosition: The character position in the output stream at which the phoneme begins.
Duration: The duration of the phoneme, in milliseconds.
NextPhoneId: The ID of the phone that follows the current one.
Feature: The SpeechVisemeFeature, which may indicate emphasis or stress on the viseme.
CurrentPhoneId: The ID of the current phone.
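
As a minimal sketch of how these parameters might be read together, the handler below could be used in place of the vox_Phoneme procedure on the form described in the Example section. It assumes the same vox and Text2 names as that example, and additionally assumes that the MultiLine property of Text2 is set to True so that each event appears on its own line. The output layout is illustrative only.

```vb
Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)

    ' Append one line per event: stream number, stream position,
    ' current phone ID, duration in milliseconds, and next phone ID.
    ' Assumes Text2.MultiLine = True (set at design time).
    Text2.Text = Text2.Text & "Stream " & StreamNumber & _
        ", position " & StreamPosition & _
        ": phone " & CurrentPhoneId & _
        " (" & Duration & " ms), next " & NextPhoneId & vbCrLf

End Sub
```

Including StreamNumber on every line makes it possible to tell events apart when several asynchronous Speak calls are queued on the same voice.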

Remarks

Depending on the language, a TTS engine may return a SAPI ID or a Universal Phone Set (UPS) ID when it encounters a phoneme. To determine which phone set (phonetic alphabet) a synthesis engine is using to create pronunciations, use the ISpPhoneticAlphabetSelection::IsAlphabetUPS method.

When the engine synthesizes a phoneme composed of more than one phoneme element, it raises an event for each element. For example, when a Japanese TTS engine speaks the phoneme "KYA," which consists of the phoneme elements "KI" and "XYA," it raises an SPEI_PHONEME event for each element. Because the element "KI" in this case modifies the sound of the element that follows it, rather than initiating a sound of its own, the duration of its SPEI_PHONEME event is zero.
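
Because some events carry a duration of zero, a handler may want to tell them apart from events for phonemes that actually produce sound. The following is a sketch only, again assuming the vox and Text2 names from the form in the Example section and intended as an alternative to the vox_Phoneme procedure there. The bracket notation is an arbitrary display choice, not part of the API.

```vb
Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)

    If Duration = 0 Then
        ' Zero duration: the element modifies the element that follows it
        ' rather than initiating a sound. Show it in brackets.
        Text2.Text = Text2.Text & "[" & CurrentPhoneId & "] "
    Else
        ' Nonzero duration: show the phone ID with its length in milliseconds.
        Text2.Text = Text2.Text & CurrentPhoneId & "(" & Duration & ") "
    End If

End Sub
```

Code that sums durations, for example to drive lip-sync timing, can skip the zero-duration events without affecting the total.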

Example

The following Visual Basic form code demonstrates the Phoneme event. To run this code, create a form with the following controls:

A command button called Command1
Two text boxes called Text1 and Text2

Paste this code into the Declarations section of the form.

The Form_Load procedure puts a text string in Text1 and creates a voice object, leaving all of its properties at their default settings. The Command1_Click procedure calls the Speak method. This causes the TTS engine to send Phoneme events to the voice; the Phoneme event procedure displays the ID of each phoneme in Text2.

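Option Explicit

' Declare the voice WithEvents so that the form can handle its events.
Public WithEvents vox As SpeechLib.SpVoice

Private Sub Command1_Click()

    ' Speak asynchronously; the call returns immediately and
    ' Phoneme events are raised as the text is spoken.
    vox.Speak Text1.Text, SVSFlagsAsync

End Sub

Private Sub Form_Load()

    ' Create the voice with default properties and supply sample text.
    Set vox = New SpVoice
    Text1.Text = "This is text in a text box."

End Sub

Private Sub vox_Phoneme(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal Duration As Long, ByVal NextPhoneId As Integer, ByVal Feature As SpeechLib.SpeechVisemeFeature, ByVal CurrentPhoneId As Integer)

    ' Append the ID of the current phone to the running list in Text2.
    Text2.Text = Text2.Text & CurrentPhoneId & " "

End Sub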
