Automation Interfaces and Objects
The Automation Interfaces provide object-oriented access to the speech recognition and text-to-speech capabilities of SAPI.
Note that all automation interface names begin with "ISpeech" and all automation object names begin with "Sp". Applications can explicitly create object variables that instantiate automation objects by using the "CreateObject" function or the "New" keyword in a "Dim" or "Set" statement. Object variables that refer to automation interfaces, on the other hand, are created only by the methods, properties and events of automation objects.
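The distinction above can be sketched from a scripting client. The following is a minimal Python sketch, not part of the SAPI documentation: it assumes a Windows system with SAPI installed and the third-party pywin32 package, and the helper name `create_sapi_object` is hypothetical. The automation object (SpVoice) is created explicitly from its ProgID, while the interface reference (ISpeechVoiceStatus) is only obtained from a property of that object.

```python
import sys


def create_sapi_object(prog_id):
    """Return a COM automation object for prog_id, or None if unavailable.

    Hypothetical helper: returns None when not running on Windows, when
    pywin32 is not installed, or when the object cannot be created.
    """
    if sys.platform != "win32":
        return None
    try:
        import win32com.client  # third-party package: pywin32
    except ImportError:
        return None
    try:
        # Plays the role of CreateObject("SAPI.SpVoice") in Visual Basic.
        return win32com.client.Dispatch(prog_id)
    except Exception:
        return None


voice = create_sapi_object("SAPI.SpVoice")
if voice is not None:
    # The ISpeechVoiceStatus interface is never created directly;
    # it is returned by the Status property of the SpVoice object.
    status = voice.Status
    print("Running state:", status.RunningState)
else:
    print("SAPI automation is not available on this system.")
```

On systems without SAPI the helper returns None, so the sketch degrades to a message instead of raising an error.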
Additionally, some automation interfaces are implemented by automation objects, and the properties and methods of those interfaces are inherited by the objects. For example, the ISpeechBaseStream interface defines a set of properties and methods for storing and manipulating audio data in memory. The SpFileStream, SpMemoryStream and SpCustomStream objects implement the ISpeechBaseStream interface; as a result, the methods and properties of the ISpeechBaseStream interface are available in all three objects.
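The relationship described above, where several objects expose the members of one shared interface, can be illustrated with a small model. This is a toy Python sketch, not SAPI code: the class names `BaseStreamModel`, `MemoryStreamModel` and `CustomStreamModel` and the helper `round_trip` are hypothetical, and the members are simplified stand-ins rather than the actual ISpeechBaseStream signatures.

```python
import io
from abc import ABC, abstractmethod


class BaseStreamModel(ABC):
    """Toy stand-in for a shared interface such as ISpeechBaseStream."""

    @abstractmethod
    def write(self, data: bytes) -> int:
        """Write bytes at the current position; return the count written."""

    @abstractmethod
    def read(self, count: int) -> bytes:
        """Read up to count bytes from the current position."""

    @abstractmethod
    def seek(self, position: int) -> int:
        """Move to an absolute position; return the new position."""


class MemoryStreamModel(BaseStreamModel):
    """Keeps its data in an in-memory buffer."""

    def __init__(self):
        self._buffer = io.BytesIO()

    def write(self, data):
        return self._buffer.write(data)

    def read(self, count):
        return self._buffer.read(count)

    def seek(self, position):
        return self._buffer.seek(position)


class CustomStreamModel(BaseStreamModel):
    """Wraps an existing stream object supplied by the caller."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def write(self, data):
        return self._wrapped.write(data)

    def read(self, count):
        return self._wrapped.read(count)

    def seek(self, position):
        return self._wrapped.seek(position)


def round_trip(stream: BaseStreamModel, payload: bytes) -> bytes:
    """Use only the shared interface, so any implementing object works."""
    stream.write(payload)
    stream.seek(0)
    return stream.read(len(payload))
```

Because `round_trip` depends only on the shared members, it accepts either implementing class, for example `round_trip(MemoryStreamModel(), b"pcm")` or `round_trip(CustomStreamModel(io.BytesIO()), b"pcm")`.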
SAPI 5.1 Automation consists of the following interfaces and objects:
Interfaces | Description |
---|---|
ISpeechAudio | Supports the control of real-time audio streams, such as those connected to a live microphone or telephone line. |
ISpeechAudioBufferInfo | Defines the audio stream buffer information. |
ISpeechAudioStatus | Provides control over the operation of real-time audio streams. |
ISpeechBaseStream | Defines properties and methods common to all audio stream objects. |
ISpeechDataKey | Provides access to the speech configuration database. |
ISpeechGrammarRule | Defines the properties and methods of a speech grammar rule. |
ISpeechGrammarRules | Represents a collection of ISpeechGrammarRule objects. |
ISpeechGrammarRuleState | Presents the properties and methods of a speech grammar rule state. |
ISpeechGrammarRuleStateTransition | Returns data about a transition from one rule state to another, or from a rule state to the end of a rule. |
ISpeechGrammarRuleStateTransitions | Represents a collection of ISpeechGrammarRuleStateTransition objects. |
ISpeechLexiconPronunciation | Provides access to the pronunciations of a speech lexicon word. |
ISpeechLexiconPronunciations | Represents a collection of ISpeechLexiconPronunciation objects. |
ISpeechLexiconWord | Provides access to a speech lexicon word. |
ISpeechLexiconWords | Represents a collection of ISpeechLexiconWord objects. |
ISpeechObjectTokens | Represents a collection of SpObjectToken objects. |
ISpeechPhraseAlternate | Enables applications to retrieve alternate phrase information from an SR engine, and to update the SR engine's language model to reflect committed alternate changes. |
ISpeechPhraseAlternates | Represents a collection of ISpeechPhraseAlternate objects. |
ISpeechPhraseElement | Provides access to information about a word or phrase. |
ISpeechPhraseElements | Represents a collection of ISpeechPhraseElement objects. |
ISpeechPhraseInfo | Contains properties detailing phrase elements. |
ISpeechPhraseProperties | Represents a collection of ISpeechPhraseProperty objects. |
ISpeechPhraseProperty | Stores the information for a semantic property. |
ISpeechPhraseReplacement | Specifies a replacement, or text normalization, of one or more spoken words. |
ISpeechPhraseReplacements | Represents a collection of ISpeechPhraseReplacement objects. |
ISpeechPhraseRule | Contains information about a speech phrase rule. |
ISpeechPhraseRules | Represents a collection of ISpeechPhraseRule objects. |
ISpeechRecognizerStatus | Returns the status of the speech recognition engine represented by the recognizer object. |
ISpeechRecoGrammar | Enables applications to manage the words and phrases for the SR engine. |
ISpeechRecoResult | Returns information about the recognition engine's hypotheses, recognitions, and false recognitions. |
ISpeechRecoResultDispatch | Cannot be obtained through QueryInterface, but allows IDispatch access to both ISpeechRecoResult and ISpeechXMLRecoResult. |
ISpeechRecoResultTimes | Contains the time information for speech recognition results. |
ISpeechVoiceStatus | Contains status information about an SpVoice object. |
ISpeechXMLRecoResult | Is used to acquire the semantic results of speech recognition and return them as an SML document. |

Objects | Description |
---|---|
SpAudioFormat | Defines an audio format. |
SpCustomStream | Supports the use of existing IStream objects in SAPI. |
SpFileStream | Provides the ability to open files as audio streams and save audio streams as files. |
SpInProcRecoContext | Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application. |
SpInProcRecoContext (Events) | Defines the types of events that a recognition context can receive. |
SpInProcRecognizer | Represents a speech recognition engine. |
SpLexicon | Provides access to lexicons, which contain information about words that can be recognized or spoken. |
SpMemoryStream | Supports audio stream operations in memory. |
SpMMAudioIn | Represents the audio implementation for the standard Windows wave-in multimedia layer. |
SpMMAudioOut | Represents the audio implementation for the standard Windows wave-out multimedia layer. |
SpObjectToken | Supports object token entries. |
SpObjectTokenCategory | Represents a class of object tokens. |
SpPhoneConverter | Supports conversion from the SAPI character phoneset to the Id phoneset. |
SpPhraseInfoBuilder | Provides the ability to rebuild phrase information from audio data saved to memory. |
SpSharedRecoContext | Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application. |
SpSharedRecoContext (Events) | Defines the types of events that a recognition context can receive. |
SpSharedRecognizer | Represents a speech recognition engine. |
SpTextSelectionInformation | Provides access to the text selection information pertaining to a word sequence buffer. |
SpUnCompressedLexicon | Provides access to lexicons, which contain information about words that can be recognized or spoken. |
SpVoice | Enables an application to perform text synthesis operations. |
SpVoice (Events) | Defines the types of events that can be received by an SpVoice object. |
SpWaveFormatEx | Defines the format of waveform-audio data. |