Automation Interfaces and Objects (Microsoft Speech Platform)

Microsoft Speech Platform SDK 11

Automation Interfaces and Objects

The Automation Interfaces provide object-oriented access to the speech recognition and text-to-speech capabilities of SAPI.

All automation interface names begin with "ISpeech", and all automation object names begin with "Sp". Applications can explicitly create object variables that instantiate automation objects by using the "CreateObject" function or the "New" keyword in a "Dim" or "Set" statement. Object variables that refer to automation interfaces, on the other hand, can be obtained only from the methods, properties, and events of automation objects.
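The naming convention can be checked mechanically. The following is a minimal Python sketch, not part of the SDK: the helper names classify and can_create_directly are hypothetical, and the code only inspects name prefixes without calling any speech API.

```python
def classify(name: str) -> str:
    """Classify an automation name by the prefix convention described above."""
    if name.startswith("ISpeech"):
        return "interface"
    if name.startswith("Sp"):
        return "object"
    return "unknown"


def can_create_directly(name: str) -> bool:
    """Only automation objects can be created explicitly by an application;
    interfaces are handed out by the members of automation objects."""
    return classify(name) == "object"


if __name__ == "__main__":
    for name in ("SpVoice", "ISpeechVoiceStatus", "SpFileStream"):
        print(name, classify(name), can_create_directly(name))
```

Under this convention, a name such as SpVoice is something an application creates itself, while a name such as ISpeechVoiceStatus is something it receives from an object it already holds.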

Additionally, some automation interfaces are implemented by automation objects, and the properties and methods of those interfaces are inherited by the objects. For example, the ISpeechBaseStream interface defines a set of properties and methods for storing and manipulating audio data in memory. The SpFileStream, SpMemoryStream and SpCustomStream objects implement the ISpeechBaseStream interface; as a result, the methods and properties of the ISpeechBaseStream interface are available in all three objects.
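The relationship between an interface and the objects that implement it can be modeled outside of COM. The sketch below is a conceptual Python analogy only: BaseStream, MemoryStream, CustomStream, and round_trip are hypothetical stand-ins, and their simplified read, write, and seek signatures are assumptions that do not match the real automation members.

```python
import io


class BaseStream:
    """Hypothetical stand-in for ISpeechBaseStream: members shared by all streams."""

    def __init__(self, buffer):
        self._buffer = buffer

    def write(self, data: bytes) -> int:
        return self._buffer.write(data)

    def seek(self, position: int) -> int:
        return self._buffer.seek(position)

    def read(self, count: int) -> bytes:
        return self._buffer.read(count)


class MemoryStream(BaseStream):
    """Stand-in for SpMemoryStream: owns an in-memory buffer."""

    def __init__(self):
        super().__init__(io.BytesIO())


class CustomStream(BaseStream):
    """Stand-in for SpCustomStream: wraps a stream that already exists."""

    def __init__(self, existing_stream):
        super().__init__(existing_stream)


def round_trip(stream: BaseStream, payload: bytes) -> bytes:
    """Use only the shared members, so any implementing object is accepted."""
    stream.write(payload)
    stream.seek(0)
    return stream.read(len(payload))


if __name__ == "__main__":
    print(round_trip(MemoryStream(), b"audio"))
    print(round_trip(CustomStream(io.BytesIO()), b"audio"))
```

Because round_trip relies only on the shared members, it works with either stream type. This mirrors how code written against the ISpeechBaseStream members applies to all three stream objects.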

Automation Interfaces and Objects

SAPI 5.1 Automation consists of the following interfaces and objects:

Interfaces	Description
ISpeechAudio	Supports the control of real-time audio streams, such as those connected to a live microphone or telephone line.
ISpeechAudioBufferInfo	Defines the audio stream buffer information.
ISpeechAudioStatus	Provides control over the operation of real-time audio streams.
ISpeechBaseStream	Defines properties and methods common to all audio stream objects.
ISpeechDataKey	Provides access to the speech configuration database.
ISpeechGrammarRule	Defines the properties and methods of a speech grammar rule.
ISpeechGrammarRules	Represents a collection of ISpeechGrammarRule objects.
ISpeechGrammarRuleState	Presents the properties and methods of a speech grammar rule state.
ISpeechGrammarRuleStateTransition	Returns data about a transition from one rule state to another, or from a rule state to the end of a rule.
ISpeechGrammarRuleStateTransitions	Represents a collection of ISpeechGrammarRuleStateTransition objects.
ISpeechLexiconPronunciation	Provides access to the pronunciations of a speech lexicon word.
ISpeechLexiconPronunciations	Represents a collection of ISpeechLexiconPronunciation objects.
ISpeechLexiconWord	Provides access to a speech lexicon word.
ISpeechLexiconWords	Represents a collection of ISpeechLexiconWord objects.
ISpeechObjectTokens	Represents a collection of SpObjectToken objects.
ISpeechPhraseAlternate	Enables applications to retrieve alternate phrase information from an SR engine, and to update the SR engine's language model to reflect committed alternate changes.
ISpeechPhraseAlternates	Represents a collection of ISpeechPhraseAlternate objects.
ISpeechPhraseElement	Provides access to information about a word or phrase.
ISpeechPhraseElements	Represents a collection of ISpeechPhraseElement objects.
ISpeechPhraseInfo	Contains properties detailing phrase elements.
ISpeechPhraseProperties	Represents a collection of ISpeechPhraseProperty objects.
ISpeechPhraseProperty	Stores the information for a semantic property.
ISpeechPhraseReplacement	Specifies a replacement, or text normalization, of one or more spoken words.
ISpeechPhraseReplacements	Represents a collection of ISpeechPhraseReplacement objects.
ISpeechPhraseRule	Contains information about a speech phrase rule.
ISpeechPhraseRules	Represents a collection of ISpeechPhraseRule objects.
ISpeechRecognizerStatus	Returns the status of the speech recognition engine represented by the recognizer object.
ISpeechRecoGrammar	Enables applications to manage the words and phrases for the SR engine.
ISpeechRecoResult	Returns information about the recognition engine's hypotheses, recognitions, and false recognitions.
ISpeechRecoResultDispatch	Cannot be obtained through QueryInterface, but allows IDispatch access to both ISpeechRecoResult and ISpeechXMLRecoResult.
ISpeechRecoResultTimes	Contains the time information for speech recognition results.
ISpeechVoiceStatus	Contains status information about an SpVoice object.
ISpeechXMLRecoResult	Is used to acquire the semantic results of speech recognition and return them as an SML document.

Objects	Description
SpAudioFormat	Defines an audio format.
SpCustomStream	Supports the use of existing IStream objects in SAPI.
SpFileStream	Provides the ability to open files as audio streams and save audio streams as files.
SpInProcRecoContext	Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application.
SpInProcRecoContext (Events)	Defines the types of events that a recognition context can receive.
SpInProcRecognizer	Represents a speech recognition engine.
SpLexicon	Provides access to lexicons, which contain information about words that can be recognized or spoken.
SpMemoryStream	Supports audio stream operations in memory.
SpMMAudioIn	Represents the audio implementation for the standard Windows wave-in multimedia layer.
SpMMAudioOut	Represents the audio implementation for the standard Windows wave-out multimedia layer.
SpObjectToken	Supports object token entries.
SpObjectTokenCategory	Represents a class of object tokens.
SpPhoneConverter	Supports conversion from the SAPI character phoneset to the Id phoneset.
SpPhraseInfoBuilder	Provides the ability to rebuild phrase information from audio data saved to memory.
SpSharedRecoContext	Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application.
SpSharedRecoContext (Events)	Defines the types of events that a recognition context can receive.
SpSharedRecognizer	Represents a speech recognition engine.
SpTextSelectionInformation	Provides access to the text selection information pertaining to a word sequence buffer.
SpUnCompressedLexicon	Provides access to lexicons, which contain information about words that can be recognized or spoken.
SpVoice	Enables an application to perform text synthesis operations.
SpVoice (Events)	Defines the types of events that can be received by an SpVoice object.
SpWaveFormatEx	Defines the format of waveform-audio data.
