SpLexicon

Microsoft Speech SDK SAPI 5.1

ISpLexicon

The lexicon database is a repository of words and word-related information such as pronunciations and parts of speech. The SAPI lexicon interface gives application, CSR engine, and TTS engine developers a standard way to create, access, modify, and synchronize with lexicons.

Types of Lexicons

The lexicon interface supports two types of custom lexicons: user and application. The user lexicon stores words specific to a user. It is a read/write lexicon and is shared among all applications. An application lexicon is supplied by an application and stores words specific to that application. Application-supplied lexicons are read-only. They ensure that the vocabulary used by the application is well represented in the lexicon.

Apart from custom lexicons, the lexicon interface provides access to the vendor, morph, and letter-to-sound lexicons that Microsoft ships with SAPI. Vendor lexicons are large-vocabulary lexicons holding words with their pronunciations and parts of speech. Morph lexicons derive pronunciations using the data in the vendor lexicon. The letter-to-sound lexicon computes the pronunciation of a word from its spelling.

User lexicons override application lexicons and engine-private lexicons. Application lexicons cannot be changed through the SpLexicon object.
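
The override rule can be pictured with a small, portable C++ model. This is a conceptual sketch only and does not use any SAPI types: ToyLexicon, ToyLexiconSet, AddUserPronunciation, and Lookup are hypothetical names, each word holds a single pronunciation string, and the relative order of the application and engine lexicons is an assumption, since the text above only states that the user lexicon comes first.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a lexicon: one pronunciation string per word.
// Real lexicons also store parts of speech and allow several pronunciations.
typedef std::map<std::string, std::string> ToyLexicon;

struct ToyLexiconSet {
    ToyLexicon user;         // read/write, shared among all applications
    ToyLexicon application;  // supplied by the application, read-only to clients
    ToyLexicon engine;       // engine-private data
};

// Only the user lexicon accepts modifications in this model.
void AddUserPronunciation(ToyLexiconSet& set, const std::string& word,
                          const std::string& pronunciation)
{
    assert(!word.empty());
    set.user[word] = pronunciation;
}

// Looks a word up, consulting the user lexicon first so that a user entry
// overrides application and engine entries. Returns false if no lexicon
// holds the word. Placing application before engine is an assumption.
bool Lookup(const ToyLexiconSet& set, const std::string& word,
            std::string& pronunciation)
{
    const ToyLexicon* order[] = { &set.user, &set.application, &set.engine };
    for (const ToyLexicon* lexicon : order) {
        ToyLexicon::const_iterator it = lexicon->find(word);
        if (it != lexicon->end()) {
            pronunciation = it->second;
            return true;
        }
    }
    return false;
}
```

In this model, a word present in both the application and user lexicons resolves to the user entry, and a word found in no lexicon leaves the output string untouched.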

Modifying and Viewing the Contents of a Lexicon

An application can modify the user lexicon by calling ISpLexicon::AddPronunciation and ISpLexicon::RemovePronunciation. ISpLexicon::GetWords lets the caller see which words are in the user or application lexicon. To obtain the pronunciations of a given word, the client calls ISpLexicon::GetPronunciations. There is no standard method for applications to access the lexicons supplied by the engine.
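
As a rough illustration of the calls named above, the following Windows-only sketch adds a pronunciation to the user lexicon, lists the pronunciations found in the user and application lexicons, and then removes the entry. It assumes the SAPI 5.1 headers (sapi.h and sphelper.h) and ATL are available and that pLexicon is a valid pointer obtained elsewhere. The function name, the word "red", and the phoneme string "r eh d" are illustrative choices, and the signatures should be checked against the installed SDK headers.

```cpp
// Windows only. Assumes the Speech SDK headers and ATL; link with sapi.lib and ole32.lib.
#include <windows.h>
#include <atlbase.h>
#include <sapi.h>
#include <sphelper.h>
#include <cstdio>

// Hypothetical helper: adds "red" to the user lexicon, prints what the user
// and application lexicons hold for that word, then removes the entry again.
HRESULT AddListAndRemove(ISpLexicon* pLexicon)
{
    const LANGID langidUS = MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US);

    // Pronunciations are passed as phone IDs, so convert the phoneme string first.
    CComPtr<ISpPhoneConverter> cpPhoneConv;
    HRESULT hr = SpCreatePhoneConverter(langidUS, NULL, NULL, &cpPhoneConv);
    if (FAILED(hr)) return hr;

    SPPHONEID phoneIds[SP_MAX_PRON_LENGTH] = { 0 };
    hr = cpPhoneConv->PhoneToId(L"r eh d", phoneIds);
    if (FAILED(hr)) return hr;

    // Only the user lexicon can be modified through ISpLexicon.
    hr = pLexicon->AddPronunciation(L"red", langidUS, SPPS_Noun, phoneIds);
    if (FAILED(hr)) return hr;

    // Query both custom lexicons and walk the returned linked list.
    SPWORDPRONUNCIATIONLIST pronList;
    ZeroMemory(&pronList, sizeof(pronList));
    hr = pLexicon->GetPronunciations(L"red", langidUS,
                                     eLEXTYPE_USER | eLEXTYPE_APP, &pronList);
    if (SUCCEEDED(hr)) {
        for (SPWORDPRONUNCIATION* p = pronList.pFirstWordPronunciation;
             p != NULL; p = p->pNextWordPronunciation) {
            wprintf(L"lexicon type %d, part of speech %d\n",
                    (int)p->eLexiconType, (int)p->ePartOfSpeech);
        }
        // The list buffer is allocated by SAPI and freed by the caller.
        ::CoTaskMemFree(pronList.pvBuffer);
    }

    // Undo the change so the sketch leaves the user lexicon without this entry.
    HRESULT hrRemove = pLexicon->RemovePronunciation(L"red", langidUS,
                                                     SPPS_Noun, phoneIds);
    return FAILED(hr) ? hr : hrRemove;
}
```

Because the user lexicon is shared among all applications, the removal at the end also deletes an identical entry that existed before the call; a real application would normally skip that step.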

Synchronizing Changes to a Lexicon

The lexicon interface provides methods to synchronize changes in lexicons using a lexicon generation ID, which acts as a kind of time stamp on the lexicon. Changes result from modifications to the user lexicon or from the installation or removal of application lexicons. The client can get the current generation by calling ISpLexicon::GetGeneration and can retrieve the change history since a given generation by calling ISpLexicon::GetGenerationChange. A speech recognition (SR) engine might use this synchronization to update its private stores with changes made to the custom lexicons while the engine was offline. For example, an SR engine can update its language model with those changes.
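
The generation mechanism can be sketched with a portable toy model. This is not how SAPI implements it: ToyUserLexicon, ToyEngineCache, and their members are hypothetical names, and the generation is simply assumed to be the number of changes recorded so far. The model only mirrors the idea that a client remembers the generation it last saw and later asks for everything that changed since then.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class ChangeKind { Added, Removed };

struct Change {
    ChangeKind kind;
    std::string word;
};

// Toy user lexicon that records every modification in a history log.
class ToyUserLexicon {
public:
    void Add(const std::string& word)    { history_.push_back({ChangeKind::Added, word}); }
    void Remove(const std::string& word) { history_.push_back({ChangeKind::Removed, word}); }

    // Plays the role of GetGeneration in this model.
    std::size_t Generation() const { return history_.size(); }

    // Plays the role of GetGenerationChange in this model: returns the changes
    // made after `since` and advances `since` to the current generation.
    std::vector<Change> ChangesSince(std::size_t& since) const {
        assert(since <= history_.size());
        std::vector<Change> changes(
            history_.begin() + static_cast<std::ptrdiff_t>(since), history_.end());
        since = history_.size();
        return changes;
    }

private:
    std::vector<Change> history_;
};

// Engine-side private store that replays the changes it missed while offline.
struct ToyEngineCache {
    std::size_t generation = 0;   // last generation this engine has seen
    std::set<std::string> words;  // private copy of the custom words

    void Sync(const ToyUserLexicon& lexicon) {
        for (const Change& change : lexicon.ChangesSince(generation)) {
            if (change.kind == ChangeKind::Added) words.insert(change.word);
            else words.erase(change.word);
        }
    }
};
```

An engine that calls Sync after being offline applies only the changes recorded since its stored generation, so repeated calls with no intervening modifications do nothing.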

How Created

An SpLexicon can be created by calling ::CoCreateInstance with CLSID_SpLexicon.
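
A minimal Windows-only sketch of that call follows. It assumes sapi.h from the Speech SDK is on the include path and that the program links with sapi.lib and ole32.lib. The choice of CLSCTX_ALL and the printed messages are illustrative, not taken from the original text.

```cpp
// Windows only. Assumes the Speech SDK headers; link with sapi.lib and ole32.lib.
#include <windows.h>
#include <sapi.h>
#include <cstdio>

int main()
{
    // COM must be initialized on the calling thread before creating the object.
    HRESULT hr = ::CoInitialize(NULL);
    if (FAILED(hr)) return 1;

    ISpLexicon* pLexicon = NULL;
    hr = ::CoCreateInstance(CLSID_SpLexicon, NULL, CLSCTX_ALL,
                            IID_ISpLexicon,
                            reinterpret_cast<void**>(&pLexicon));
    if (SUCCEEDED(hr)) {
        printf("SpLexicon created\n");
        pLexicon->Release();  // release the interface before shutting COM down
    } else {
        printf("CoCreateInstance failed: 0x%08lx\n", static_cast<unsigned long>(hr));
    }

    ::CoUninitialize();
    return SUCCEEDED(hr) ? 0 : 1;
}
```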