SPEVENTENUM

Microsoft Speech SDK

SAPI 5.1

SPEVENTENUM lists the events possible from SAPI.

It is recommended that developers use the helper class CSpEvent to easily and clearly decode events.

typedef enum SPEVENTENUM
{
    SPEI_UNDEFINED,
  
    //--- TTS engine
    SPEI_START_INPUT_STREAM,
    SPEI_END_INPUT_STREAM,
    SPEI_VOICE_CHANGE,
    SPEI_TTS_BOOKMARK,
    SPEI_WORD_BOUNDARY,
    SPEI_PHONEME,
    SPEI_SENTENCE_BOUNDARY,
    SPEI_VISEME,
    SPEI_TTS_AUDIO_LEVEL,

    //--- Engine vendors use these reserved bits
    SPEI_TTS_PRIVATE,

    SPEI_MIN_TTS,
    SPEI_MAX_TTS,

    //--- Speech Recognition
    SPEI_END_SR_STREAM,
    SPEI_SOUND_START,
    SPEI_SOUND_END,
    SPEI_PHRASE_START,
    SPEI_RECOGNITION,
    SPEI_HYPOTHESIS,
    SPEI_SR_BOOKMARK,
    SPEI_PROPERTY_NUM_CHANGE,
    SPEI_PROPERTY_STRING_CHANGE,
    SPEI_FALSE_RECOGNITION,
    SPEI_INTERFERENCE,
    SPEI_REQUEST_UI,
    SPEI_RECO_STATE_CHANGE,
    SPEI_ADAPTATION,
    SPEI_START_SR_STREAM,
    SPEI_RECO_OTHER_CONTEXT,
    SPEI_SR_AUDIO_LEVEL,

    //--- Engine vendors use these reserved bits
    SPEI_SR_PRIVATE,

    SPEI_MIN_SR,
    SPEI_MAX_SR,

    SPEI_RESERVED1,
    SPEI_RESERVED2,
    SPEI_RESERVED3
} SPEVENTENUM;

Elements

SPEI_START_INPUT_STREAM: The input stream (text or audio) from a Speak or SpeakStream call has begun synthesizing to the output. The event is fired by SAPI.
SPEI_END_INPUT_STREAM: The input stream (text or audio) from a Speak or SpeakStream call has finished synthesizing to the output. The event is fired by SAPI.
SPEI_VOICE_CHANGE: SAPI fires this event for voice changes within a single input stream of a Speak call. wParam is SPF_PERSIST_XML if the current Speak call was made with SPF_PERSIST_XML; otherwise it is zero. lParam is the current voice object token. elParamType must be SPET_LPARAM_IS_TOKEN.
SPEI_TTS_BOOKMARK: The bookmark element inserts a bookmark into the input stream. An application that specifies interest in bookmark events receives them during synthesis. wParam is the current bookmark name, read as a base-10 number and converted to a long integer. If the name of the current bookmark is not an integer, wParam is zero. lParam is the bookmark string. elParamType must be SPET_LPARAM_IS_STRING.
SPEI_WORD_BOUNDARY: A word is beginning to synthesize. Markup language (XML) markers are counted in the boundaries and offsets. wParam is the character length of the word in the current input stream being synthesized. lParam is the character position within the current text input stream of the word being synthesized.
SPEI_PHONEME: A phoneme was returned by the TTS engine. The high word of wParam is the duration, in milliseconds, of the current phoneme element. The low word of wParam is the id of the next phoneme element. The high word of lParam is the phoneme element feature defined in SPVFEATURE. This value is zero if the current phoneme element carries neither primary stress nor emphasis. The low word of lParam is the id of the current phoneme element being synthesized.

When the engine synthesizes a phoneme comprised of more than one phoneme element, it raises an event for each element. For example, when a Japanese TTS engine speaks the phoneme "KYA," which is comprised of the phoneme elements "KI" and "XYA," it raises an SPEI_PHONEME event for each element. Because the element "KI" in this case modifies the sound of the element following it, rather than initiating a sound, the duration of its SPEI_PHONEME event is zero.
SPEI_SENTENCE_BOUNDARY: A sentence is beginning to synthesize. wParam is the character length of the sentence including punctuation in the current input stream being synthesized. lParam is the character position within the current text input stream of the sentence being synthesized.
SPEI_VISEME: A viseme was determined by the synthesis engine. The high word of wParam is the duration, in milliseconds, of the current viseme. The low word of wParam is the next viseme, of type SPVISEMES. The high word of lParam is the viseme feature defined in SPVFEATURE. This value is zero if the current viseme carries neither primary stress nor emphasis. The low word of lParam is the current viseme being synthesized.
SPEI_TTS_AUDIO_LEVEL: This event is fired by SAPI. lParam is 0, and wParam is the current audio level from zero to 100.
SPEI_TTS_PRIVATE: Reserved for private/internal use by the TTS Engine.
SPEI_MIN_TTS: Minimum event enumeration value for TTS events.
SPEI_MAX_TTS: Maximum event enumeration value for TTS events.
SPEI_END_SR_STREAM: The SR engine has finished receiving an audio input stream. lParam contains the SR engine's final HRESULT code (see CSpEvent::EndStreamResult). wParam contains a Boolean value signifying whether the audio input stream object was released (see CSpEvent::InputStreamReleased).
SPEI_SOUND_START: The SR engine determined that audible sound is available through the input stream.
SPEI_SOUND_END: The SR engine has determined that audible sound is no longer available through the input stream, or that the sound stream has been inactive for a period.
SPEI_PHRASE_START: The SR engine is starting to recognize a phrase. Note that this MUST be followed by either an SPEI_FALSE_RECOGNITION or SPEI_RECOGNITION event.
SPEI_RECOGNITION: The SR engine is returning a full recognition, that is, its best guess at a text representation of the audio data. lParam is a pointer to an ISpRecoResult object (see CSpEvent::RecoResult).
SPEI_HYPOTHESIS: The SR engine is returning a partial phrase recognition, effectively its best guess up to that point in the stream. lParam is a pointer to an ISpRecoResult object (see CSpEvent::RecoResult).
SPEI_SR_BOOKMARK: A Bookmark event is returned when the SR engine has processed to the stream position of a bookmark. lParam is an application specified value set using ISpRecoContext::Bookmark. wParam is SPREF_AutoPause if ISpRecoContext::Bookmark was called with SPBO_PAUSE, and NULL otherwise.
SPEI_PROPERTY_NUM_CHANGE: A numeric property supported by the SR engine was changed. lParam is a string pointer to the name of the property that changed (see CSpEvent::PropertyName). wParam contains the new value (see CSpEvent::PropertyNumValue).
SPEI_PROPERTY_STRING_CHANGE: A string property supported by the SR engine was changed. lParam is a string pointer to the name of the property that changed (see CSpEvent::PropertyName). The new property value immediately follows the NULL terminator of the property name (see CSpEvent::PropertyStringValue).
SPEI_FALSE_RECOGNITION: Apparent speech without valid recognition. An SR engine can optionally return a result object, which is referenced by lParam (see CSpEvent::RecoResult).
SPEI_INTERFERENCE: The SR engine determined that a problem with the sound stream is preventing successful recognition. lParam is any combination of SPINTERFERENCE flags (see CSpEvent::Interference).
SPEI_REQUEST_UI: The SR engine is requesting that a specific user interface be displayed. lParam is a null-terminated string (see CSpEvent::RequestTypeOfUI).
SPEI_RECO_STATE_CHANGE: The recognizer state has changed. wParam is the new recognizer state (see SPRECOSTATE and CSpEvent::RecoState).
SPEI_ADAPTATION: The SR engine is ready to process the adaptation buffer.
SPEI_START_SR_STREAM: The SR engine has reached the start of a new audio stream.
SPEI_SR_AUDIO_LEVEL: The audio input stream object fires this event. wParam is the current audio level from zero to 100.
SPEI_RECO_OTHER_CONTEXT: A recognition was sent to another context.
SPEI_SR_PRIVATE: Reserved for private/internal use by the SR engine.
SPEI_MIN_SR: Minimum event enumeration value for speech recognition events.
SPEI_MAX_SR: Maximum event enumeration value for speech recognition events.
SPEI_RESERVED1: Reserved for SAPI internal use. See SPFEI Remarks section.
SPEI_RESERVED2: Reserved for SAPI internal use. See SPFEI Remarks section.
SPEI_RESERVED3: Reserved for future use; do not use.
