Grammar Compiler interfaces

Microsoft Speech SDK SAPI 5.1

Grammar Compiler Interfaces (API-level)

Many speech recognition applications are built on voice commands, also called command and control (C&C). For example, the developer of a graphical Solitaire application may want to add C&C so that users can say "new game" or "play the ace of spades" into their computer microphone instead of using menu options or keyboard accelerators. Microsoft Office XP has a C&C mode that enables users to speak voice commands mapped to virtually every menu command and many parts of the user interface (e.g., "File New", "Tools Options", "Outlook Today").

Applications can use SAPI 5's C&C features to implement functionality similar to a voice-command enabled Solitaire, Office XP, and other innovative applications. The C&C features of SAPI 5 are implemented as context-free grammars (CFGs). A CFG is a structure that defines a specific set of words and the combinations of those words that can be used. In basic terms, a CFG defines which sentences are valid; in SAPI 5, it defines the sentences that are valid for recognition by a speech recognition (SR) engine.

The CFG format in SAPI 5 defines the structure of grammars and grammar rules using Extensible Markup Language (XML). The CFG/Grammar compiler transforms the XML tags defining the grammar elements into a binary format used by SAPI 5-compliant SR engines. This compiling process can be performed either before or during application run time.
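As an illustration of the text grammar format described above, the following is a minimal sketch of a grammar for the Solitaire commands mentioned earlier. The rule names (NewGame, PlayCard, Rank, Suit) and the short word lists are hypothetical and chosen only for this example; LANGID="409" is assumed here to select U.S. English. Treat it as a starting point and not as a grammar shipped with the SDK.

```xml
<GRAMMAR LANGID="409">
  <!-- Top-level rule: matches the phrase "new game" -->
  <RULE NAME="NewGame" TOPLEVEL="ACTIVE">
    <P>new game</P>
  </RULE>

  <!-- Top-level rule: "play the <rank> of <suit>" -->
  <RULE NAME="PlayCard" TOPLEVEL="ACTIVE">
    <P>play the</P>
    <RULEREF NAME="Rank"/>
    <P>of</P>
    <RULEREF NAME="Suit"/>
  </RULE>

  <!-- Helper rule (hypothetical): a list of alternative ranks -->
  <RULE NAME="Rank">
    <L>
      <P>ace</P>
      <P>king</P>
      <P>queen</P>
    </L>
  </RULE>

  <!-- Helper rule (hypothetical): a list of alternative suits -->
  <RULE NAME="Suit">
    <L>
      <P>spades</P>
      <P>hearts</P>
      <P>diamonds</P>
      <P>clubs</P>
    </L>
  </RULE>
</GRAMMAR>
```

With this grammar, "new game" and "play the ace of spades" are valid sentences, while "play spades" is not, because no rule produces that word sequence. Only the two rules marked TOPLEVEL="ACTIVE" are available for recognition on their own; Rank and Suit exist only to be referenced from PlayCard.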

The Speech SDK includes a grammar compiler, which can be used to author text grammars, compile text grammars into the SAPI 5 binary format, and perform basic testing before integration into an application. Also see the SDK Sample: Grammar Compiler.

SAPI 5 also enables applications to create CFG structures programmatically using the ISpGrammarBuilder interface, which is inherited by ISpRecoGrammar. The application can use the ISpGrammarBuilder API to dynamically update an already loaded SAPI 5 XML grammar, create an in-memory SAPI 5 grammar, and/or save an in-memory SAPI 5 grammar to a memory stream (e.g., for saving grammars to the hard disk).
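One way to prepare an XML grammar for run-time updates is to mark the rule the application intends to change as dynamic. The sketch below assumes the DYNAMIC attribute of the RULE tag; the rule names and the placeholder phrase are hypothetical. The application would then replace the contents of the dynamic rule through ISpGrammarBuilder after the grammar is loaded.

```xml
<GRAMMAR LANGID="409">
  <!-- Fixed part of the command, authored ahead of time -->
  <RULE NAME="OpenDocument" TOPLEVEL="ACTIVE">
    <P>open</P>
    <RULEREF NAME="DocumentName"/>
  </RULE>

  <!-- Hypothetical dynamic rule: the application is expected to replace
       this placeholder with real document names at run time -->
  <RULE NAME="DocumentName" DYNAMIC="TRUE">
    <P>placeholder</P>
  </RULE>
</GRAMMAR>
```

Keeping the fixed phrases in the XML file and limiting run-time changes to one small rule keeps most of the grammar authorable and testable in the grammar compiler.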

The following section covers:

  • Text Grammar Format: SAPI 5-defined XML grammar format for defining a CFG with plain text.
  • ISpGrammarBuilder: SAPI 5 API for programmatically creating, editing, or saving in-memory and binary CFGs.

Applications that do not need to modify a grammar at run time, or that need better performance from their CFG-based features, should load the compiled binary form statically (not dynamically). When the backend grammar compiler is loaded at application run time, SAPI must allow for modification and validation of complicated state/transition graphs, which adds overhead.