SPEVENTENUM

Microsoft Speech SDK

SAPI 5.1

SPEVENTENUM lists the events possible from SAPI.

It is recommended that developers use the helper class CSpEvent to easily and clearly decode events.

typedef enum SPEVENTENUM
{
    SPEI_UNDEFINED,
  
    //--- TTS engine
    SPEI_START_INPUT_STREAM,
    SPEI_END_INPUT_STREAM,
    SPEI_VOICE_CHANGE,
    SPEI_TTS_BOOKMARK,
    SPEI_WORD_BOUNDARY,
    SPEI_PHONEME,
    SPEI_SENTENCE_BOUNDARY,
    SPEI_VISEME,
    SPEI_TTS_AUDIO_LEVEL,

    //--- Engine vendors use these reserved bits
    SPEI_TTS_PRIVATE,

    SPEI_MIN_TTS,
    SPEI_MAX_TTS,

    //--- Speech Recognition
    SPEI_END_SR_STREAM,
    SPEI_SOUND_START,
    SPEI_SOUND_END,
    SPEI_PHRASE_START,
    SPEI_RECOGNITION,
    SPEI_HYPOTHESIS,
    SPEI_SR_BOOKMARK,
    SPEI_PROPERTY_NUM_CHANGE,
    SPEI_PROPERTY_STRING_CHANGE,
    SPEI_FALSE_RECOGNITION,
    SPEI_INTERFERENCE,
    SPEI_REQUEST_UI,
    SPEI_RECO_STATE_CHANGE,
    SPEI_ADAPTATION,
    SPEI_START_SR_STREAM,
    SPEI_RECO_OTHER_CONTEXT,
    SPEI_SR_AUDIO_LEVEL,

    //--- Engine vendors use these reserved bits
    SPEI_SR_PRIVATE,

    SPEI_MIN_SR,
    SPEI_MAX_SR,

    SPEI_RESERVED1,
    SPEI_RESERVED2,
    SPEI_RESERVED3
} SPEVENTENUM;

Elements

SPEI_START_INPUT_STREAM
The input stream (text or audio) from a Speak or SpeakStream call has begun synthesizing to the output. The event is fired by SAPI.
SPEI_END_INPUT_STREAM
The input stream (text or audio) from a Speak or SpeakStream call has finished synthesizing to the output. The event is fired by SAPI.
SPEI_VOICE_CHANGE
SAPI fires this event for voice changes within a single input stream of a Speak call. wParam is SPF_PERSIST_XML if the current Speak call was made with SPF_PERSIST_XML, and zero otherwise. lParam is the current voice object token. elParamType must be SPET_LPARAM_IS_TOKEN.
SPEI_TTS_BOOKMARK
The bookmark element is used to insert a bookmark into the input stream. If an application specifies interest in bookmark events, it will receive them during synthesis. wParam is the current bookmark name, read as a base 10 number and converted to a long integer. If the name of the current bookmark is not an integer, wParam will be zero. lParam is the bookmark string. elParamType must be SPET_LPARAM_IS_STRING.
SPEI_WORD_BOUNDARY
A word is beginning to synthesize. Markup language (XML) markers are counted in the boundaries and offsets. wParam is the character length of the word in the current input stream being synthesized. lParam is the character position within the current text input stream of the word being synthesized.
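As an illustration of how the two SPEI_WORD_BOUNDARY parameters relate to the input text, the following sketch slices the current word out of the original string. WordFromEvent is a hypothetical helper, not part of SAPI. It assumes the application kept a copy of the text it passed to Speak and that both values count wide characters.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a SAPI function): returns the word that an
// SPEI_WORD_BOUNDARY event refers to.
//   position: character offset of the word, taken from the event's lParam
//   length:   character length of the word, taken from the event's wParam
constexpr std::wstring_view WordFromEvent(std::wstring_view inputText,
                                          std::size_t position,
                                          std::size_t length)
{
    // Guard against an offset past the end of the saved text;
    // substr() would otherwise throw std::out_of_range.
    if (position > inputText.size()) {
        return {};
    }
    // substr() clamps the length to the end of the text.
    return inputText.substr(position, length);
}
```

Because markup is counted in the offsets, the position indexes into the text exactly as it was passed in, tags included. For the input `<emph>Hello</emph> world`, an event with position 19 and length 5 selects `world`.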
SPEI_PHONEME
A phoneme was returned by the TTS engine. The high word of wParam is the duration, in milliseconds, of the current phoneme element. The low word of wParam is the id of the next phoneme element. The high word of lParam is the phoneme element feature defined in SPVFEATURE. This value will be zero if the current phoneme element is not a primary stress or emphasis. The low word of lParam is the id of the current phoneme element being synthesized.

When the engine synthesizes a phoneme comprised of more than one phoneme element, it raises an event for each element. For example, when a Japanese TTS engine speaks the phoneme "KYA," which is comprised of the phoneme elements "KI" and "XYA," it raises an SPEI_PHONEME event for each element. Because the element "KI" in this case modifies the sound of the element following it, rather than initiating a sound, the duration of its SPEI_PHONEME event is zero.
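The word packing described for SPEI_PHONEME can be unpacked with plain shifts and masks. The sketch below is a minimal illustration under stated assumptions: PhonemeEvent, LowWord, HighWord and DecodePhoneme are hypothetical names, and std::uint64_t stands in for the Windows WPARAM and LPARAM types so the code compiles without the SDK headers.

```cpp
#include <cstdint>

// Hypothetical container for the four fields carried by an SPEI_PHONEME event.
struct PhonemeEvent {
    std::uint16_t durationMs;        // high word of wParam
    std::uint16_t nextPhonemeId;     // low word of wParam
    std::uint16_t feature;           // high word of lParam (SPVFEATURE, 0 if none)
    std::uint16_t currentPhonemeId;  // low word of lParam
};

// Bits 0..15 of the value.
constexpr std::uint16_t LowWord(std::uint64_t value)
{
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

// Bits 16..31 of the value.
constexpr std::uint16_t HighWord(std::uint64_t value)
{
    return static_cast<std::uint16_t>((value >> 16) & 0xFFFFu);
}

// Splits the two event parameters into their documented fields.
constexpr PhonemeEvent DecodePhoneme(std::uint64_t wParam, std::uint64_t lParam)
{
    return PhonemeEvent{HighWord(wParam), LowWord(wParam),
                        HighWord(lParam), LowWord(lParam)};
}
```

For example, a wParam of 0x0050000B decodes to a duration of 80 milliseconds and a next phoneme id of 11. An element such as "KI" in the example above would decode to a duration of zero.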
SPEI_SENTENCE_BOUNDARY
A sentence is beginning to synthesize. wParam is the character length of the sentence including punctuation in the current input stream being synthesized. lParam is the character position within the current text input stream of the sentence being synthesized.
SPEI_VISEME
A viseme was determined by the synthesis engine. The high word of wParam is the duration, in milliseconds, of the current viseme. The low word of wParam is the next viseme, of type SPVISEMES. The high word of lParam is the viseme feature defined in SPVFEATURE. This value will be zero if the current viseme is not a primary stress or emphasis. The low word of lParam is the current viseme being synthesized.
SPEI_TTS_AUDIO_LEVEL
This event is fired by SAPI. lParam is 0, and wParam is the current audio level from zero to 100.
SPEI_TTS_PRIVATE
Reserved for private/internal use by the TTS Engine.
SPEI_MIN_TTS
Minimum event enumeration value for TTS events.
SPEI_MAX_TTS
Maximum event enumeration value for TTS events.
SPEI_END_SR_STREAM
The SR engine has finished receiving an audio input stream. LPARAM contains the SR engine's final HRESULT code (see CSpEvent::EndStreamResult). WPARAM indicates whether the audio input stream object was released (see CSpEvent::InputStreamReleased).
SPEI_SOUND_START
The SR engine determined that audible sound is available through the input stream.
SPEI_SOUND_END
The SR engine has determined that audible sound is no longer available through the input stream, or that the sound stream has been inactive for a period.
SPEI_PHRASE_START
The SR engine is starting to recognize a phrase. Note that this MUST be followed by either an SPEI_FALSE_RECOGNITION or SPEI_RECOGNITION event.
SPEI_RECOGNITION
The SR engine is returning a full recognition - its best guess at a text representation of the audio data. lParam is a pointer to an ISpRecoResult object (see CSpEvent::RecoResult).
SPEI_HYPOTHESIS
The SR engine is returning a partial phrase recognition - effectively its best guess up to that point in the stream. lParam is a pointer to an ISpRecoResult object (see CSpEvent::RecoResult).
SPEI_SR_BOOKMARK
A bookmark event is returned when the SR engine has processed the stream up to the position of a bookmark. lParam is an application-specified value set using ISpRecoContext::Bookmark. wParam is SPREF_AutoPause if ISpRecoContext::Bookmark was called with SPBO_PAUSE, and NULL otherwise.
SPEI_PROPERTY_NUM_CHANGE
A numeric property supported by the SR engine was changed. LPARAM is a string pointer to the property name that changed (see CSpEvent::PropertyName). WPARAM contains the new value (see CSpEvent::PropertyNumValue).
SPEI_PROPERTY_STRING_CHANGE
A string property supported by the SR engine was changed. LPARAM is a string pointer to the property name that changed (see CSpEvent::PropertyName). Immediately following the NULL termination of the property name is the new property value (see CSpEvent::PropertyStringValue).
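The layout described for SPEI_PROPERTY_STRING_CHANGE is two null-terminated strings stored back to back. The sketch below shows one way to read both from that buffer. PropertyChange and SplitPropertyBuffer are hypothetical names, the property name used in the usage note is invented for illustration, and in real code the CSpEvent accessors mentioned above would normally be used instead.

```cpp
#include <string_view>

// Hypothetical result type: views into the buffer, no copies are made.
struct PropertyChange {
    std::wstring_view name;   // text up to the first null terminator
    std::wstring_view value;  // text that starts right after that terminator
};

// Reads the property name and the new value from a buffer laid out as
// "name\0value\0". The buffer must outlive the returned views.
constexpr PropertyChange SplitPropertyBuffer(const wchar_t* buffer)
{
    // Constructing a view from a pointer stops at the first null terminator.
    const std::wstring_view name(buffer);
    // Skip the name and its terminator to reach the value.
    const std::wstring_view value(buffer + name.size() + 1);
    return PropertyChange{name, value};
}
```

Given a buffer holding `ExampleProperty`, a null, `on`, and a final null, the function yields the name `ExampleProperty` and the value `on`.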
SPEI_FALSE_RECOGNITION
Apparent speech without valid recognition. An SR engine can optionally return a result object, which will be referenced by the LPARAM member (see CSpEvent::RecoResult).
SPEI_INTERFERENCE
The SR engine determined that the sound stream has a hindrance that is preventing a successful recognition. lParam is any combination of SPINTERFERENCE flags (see CSpEvent::Interference).
SPEI_REQUEST_UI
The SR engine is requesting that a specific user interface be displayed. LPARAM is a null-terminated string (see CSpEvent::RequestTypeOfUI).
SPEI_RECO_STATE_CHANGE
The recognizer state has changed. WPARAM is the new recognizer state (see SPRECOSTATE and CSpEvent::RecoState).
SPEI_ADAPTATION
The SR engine is ready to process the adaptation buffer.
SPEI_START_SR_STREAM
The SR engine has reached the start of a new audio stream.
SPEI_SR_AUDIO_LEVEL
The audio input stream object fires this event. wParam is the current audio level from zero to 100.
SPEI_RECO_OTHER_CONTEXT
A recognition was sent to another context.
SPEI_SR_PRIVATE
Reserved for private/internal use by the SR engine.
SPEI_MIN_SR
Minimum event enumeration value for speech recognition events.
SPEI_MAX_SR
Maximum event enumeration value for speech recognition events.
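The minimum and maximum values make it possible to tell TTS events from SR events with a range check. The sketch below is a minimal illustration. The numeric values are assumptions recalled from the SDK's sapi.h, since the listing above does not show them, so confirm them against your own header. The sketch namespace, EventFamily and Classify are hypothetical names.

```cpp
namespace sketch {

// ASSUMPTION: values as recalled from sapi.h; verify against your SDK header.
enum EventId {
    SPEI_MIN_TTS       = 1,
    SPEI_WORD_BOUNDARY = 5,
    SPEI_MAX_TTS       = 15,
    SPEI_MIN_SR        = 34,
    SPEI_RECOGNITION   = 38,
    SPEI_MAX_SR        = 52
};

enum class EventFamily { Tts, Sr, Other };

// Classifies an event id by the range it falls into.
constexpr EventFamily Classify(int eventId)
{
    return (eventId >= SPEI_MIN_TTS && eventId <= SPEI_MAX_TTS) ? EventFamily::Tts
         : (eventId >= SPEI_MIN_SR && eventId <= SPEI_MAX_SR)   ? EventFamily::Sr
                                                                : EventFamily::Other;
}

}  // namespace sketch
```

With these values, SPEI_WORD_BOUNDARY falls in the TTS range and SPEI_RECOGNITION in the SR range. Any id outside both ranges, including SPEI_UNDEFINED, is reported as Other.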
SPEI_RESERVED1
Reserved for SAPI internal use. See SPFEI Remarks section.
SPEI_RESERVED2
Reserved for SAPI internal use. See SPFEI Remarks section.
SPEI_RESERVED3
Reserved for future use; do not use.
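The first two reserved values are used when building an event interest mask with the SPFEI macro. The sketch below mirrors that macro as a constexpr function so the resulting bit pattern can be inspected. It is an illustration under stated assumptions: the numeric values of SPEI_RESERVED1 and SPEI_RESERVED2 are recalled from sapi.h and should be verified there, and the sketch namespace, kFlagCheck and InterestBit are hypothetical names. Real code should use the SDK's own SPFEI macro.

```cpp
#include <cstdint>

namespace sketch {

// ASSUMPTION: values as recalled from sapi.h; verify against your SDK header.
enum EventId {
    SPEI_WORD_BOUNDARY = 5,
    SPEI_RESERVED1     = 30,
    SPEI_RESERVED2     = 33
};

// Both reserved bits are set in every interest flag so that a raw event id
// passed by mistake, instead of a flag, can be detected.
constexpr std::uint64_t kFlagCheck =
    (std::uint64_t{1} << SPEI_RESERVED1) | (std::uint64_t{1} << SPEI_RESERVED2);

// Turns an event id into its 64-bit interest flag: one bit for the event
// itself, combined with the two reserved check bits.
constexpr std::uint64_t InterestBit(int eventId)
{
    return (std::uint64_t{1} << eventId) | kFlagCheck;
}

}  // namespace sketch
```

Under these assumptions, the flag for SPEI_WORD_BOUNDARY has bits 5, 30 and 33 set. Flags for several events are combined with a bitwise OR.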