grammar Element (Microsoft.Speech)

Microsoft Speech Platform SDK 11


Specifies the highest level container for an XML grammar definition. This element is required to make a valid grammar.

Syntax

XML
<grammar
   version = "1.0"
   mode = "voice | dtmf"
   root = "string"
   tag-format = "semantics/1.0 | semantics-ms/1.0 | semantics/1.0-literals"
   xml:lang = "language code-country/region code"
   xml:base = "grammarBaseUri"
   xmlns = "http://www.w3.org/2001/06/grammar"
   xmlns:sapi = "http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions"
   sapi:alphabet = "ipa | x-microsoft-ups | x-microsoft-sapi">
</grammar>

Attributes

Attribute

Description

version

Required. Specifies the version number of the Speech Recognition Grammar Specification used. The only accepted value is 1.0.

mode

Optional. Specifies the mode of the contained or referenced grammar. The mode can be one of the following values.

  • voice for spoken input

  • dtmf for dual tone multi-frequency (DTMF) input

If omitted, the default value is voice.
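As a minimal sketch of a non-voice grammar, the following sets mode to dtmf and accepts a sequence of exactly four key presses. The rule names Pin and Digit are hypothetical, and xml:lang is left out on the assumption that it is optional for DTMF grammars, as described for the xml:lang attribute.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" mode="dtmf" root="Pin"
   xmlns="http://www.w3.org/2001/06/grammar">

   <!-- Hypothetical root rule: exactly four DTMF digits -->
   <rule id="Pin" scope="public">
      <item repeat="4">
         <ruleref uri="#Digit"/>
      </item>
   </rule>

   <!-- Hypothetical helper rule: one key press, 0 through 9 -->
   <rule id="Digit">
      <one-of>
         <item>0</item>
         <item>1</item>
         <item>2</item>
         <item>3</item>
         <item>4</item>
         <item>5</item>
         <item>6</item>
         <item>7</item>
         <item>8</item>
         <item>9</item>
      </one-of>
   </rule>
</grammar>
```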

root

Optional. Specifies the name of the default grammar rule. If omitted, the grammar passes validation checks and compiles, but does not trigger recognition. The rule declared as the root rule must be defined within the scope of the grammar. The root rule can be scoped as either public or private.
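The following sketch illustrates the requirement that the root rule be defined inside the grammar that names it. The rule names Order and Drink are hypothetical; Order is the root rule and refers to a second, private rule in the same document.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Order"
   xmlns="http://www.w3.org/2001/06/grammar">

   <!-- Root rule: named by the root attribute and defined in this grammar -->
   <rule id="Order" scope="public">
      <item>I would like</item>
      <ruleref uri="#Drink"/>
   </rule>

   <!-- Private helper rule referenced by the root rule -->
   <rule id="Drink" scope="private">
      <one-of>
         <item>coffee</item>
         <item>tea</item>
      </one-of>
   </rule>
</grammar>
```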

tag-format

Required if a grammar contains tag elements. Specifies the content type of all tag elements contained within a grammar. This attribute takes one of the following values:

  • semantics/1.0 declares that the content within tag elements is ECMAScript.

  • semantics-ms/1.0 declares that the content within tag elements is ECMAScript as implemented by Microsoft.

  • semantics/1.0-literals declares that the content within tag elements is a boolean, an integer, a float, or a string. A string CANNOT be enclosed in double quotes.
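To make the difference between these values concrete, here is a small sketch that uses semantics/1.0-literals, where each tag holds a plain, unquoted value. The rule name Answer is hypothetical, and the ECMAScript form shown in the comment assumes the `out` variable defined by the W3C SISR specification.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Answer"
   tag-format="semantics/1.0-literals"
   xmlns="http://www.w3.org/2001/06/grammar">

   <rule id="Answer" scope="public">
      <one-of>
         <!-- Literal content: a bare string, not enclosed in double quotes -->
         <item>yes <tag>affirmative</tag></item>
         <item>yeah <tag>affirmative</tag></item>
         <item>no <tag>negative</tag></item>
      </one-of>
   </rule>

   <!-- With tag-format="semantics/1.0", the same tag would instead contain
        ECMAScript, for example: <tag>out = "affirmative";</tag> -->
</grammar>
```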

xml:lang

Required if the value of the mode attribute is voice; optional if the value of the mode attribute is dtmf. Declares the single language for the content of the containing grammar document. The value may contain either a lower-case, two-letter language code (such as "en" for English or "fr" for French) or a language code followed by an upper-case country/region code or other variation. Examples with a country/region code include "es-US" for Spanish as spoken in the US and "fr-CA" for French as spoken in Canada. See the Remarks section for additional information.

xml:base

Optional. Specifies a grammar document's base Uniform Resource Identifier (URI). The value for xml:base is used to resolve relative URIs in a grammar document. For example, a grammar file declares:
xml:base="http://www.contoso.com/"
and contains a relative reference to another document, for example:
<ruleref uri="ExternalGrammar.grxml">
This creates the following absolute path to the document:
http://www.contoso.com/ExternalGrammar.grxml.
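Put together in one document, the resolution described above might look like the following sketch. The rule name Main is hypothetical. Because the reference carries no rule name, it is assumed to target the root rule of the external grammar, as SRGS defines for references without a fragment.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Main"
   xml:base="http://www.contoso.com/"
   xmlns="http://www.w3.org/2001/06/grammar">

   <rule id="Main" scope="public">
      <!-- Relative URI, resolved against xml:base to
           http://www.contoso.com/ExternalGrammar.grxml -->
      <ruleref uri="ExternalGrammar.grxml"/>
   </rule>
</grammar>
```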

xmlns

Required. Specifies the XML namespace for W3C speech recognition grammar. The XML namespace is http://www.w3.org/2001/06/grammar.

xmlns:sapi

Optional. Defines a namespace prefix. Required only when the grammar uses the sapi:alphabet attribute, the sapi:dynamic attribute of the rule element, or the sapi:pron attribute of the token element. When used, attribute and value must appear as follows: xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions"

sapi:alphabet

Optional. Specifies the phonetic alphabet to use for pronunciations defined in the sapi:pron attribute of token elements. Valid values are ipa, x-microsoft-ups, and x-microsoft-sapi. When using sapi:alphabet, the grammar element must contain the following declaration: xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions"
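The following sketch shows how sapi:alphabet, the xmlns:sapi declaration, and sapi:pron fit together. The rule name Whale and the IPA string are illustrative assumptions; check the pronunciation against the phone set your recognizer supports before relying on it.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Whale"
   xmlns="http://www.w3.org/2001/06/grammar"
   xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions"
   sapi:alphabet="ipa">

   <rule id="Whale" scope="public">
      <item>
         <!-- sapi:pron is interpreted in the alphabet declared above (IPA) -->
         <token sapi:pron="weɪl">whale</token>
      </item>
   </rule>
</grammar>
```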

Remarks

The model and syntax indicated by the tag-format values semantics/1.0 and semantics/1.0-literals are defined in the W3C specification recommendation Semantic Interpretation for Speech Recognition (SISR) Version 1.0. The tag-format value semantics-ms/1.0 indicates a model and syntax defined by Microsoft. See Support for Semantic Markup for more information.

The content of tag elements in a grammar must be of the type declared in the grammar element's tag-format attribute. Using a string literal syntax when the value of tag-format is semantics/1.0 or semantics-ms/1.0 will generally result in a runtime error. Using the ECMAScript syntax when the value of tag-format is semantics/1.0-literals will not produce a runtime error, but will erroneously populate Rule Variables with ECMAScript code. See tag Element (Microsoft.Speech) for more information and examples that use the syntax for each of the values of tag-format.

For a given language code declared in the xml:lang attribute, a speech recognition engine that supports that language code must be installed for the grammar to be loaded successfully. The Microsoft Speech Platform Runtime 11 and Microsoft Speech Platform SDK 11 do not include any installed languages. You must download a language pack for each language on which you want to perform speech recognition.

The language packs are different for each version of the Speech Platform Runtime. You must download the language pack version that matches the version of the Speech Platform Runtime that you have installed. The language packs for the Speech Platform SDK 11 are for server-based applications and are different than the languages that ship with Windows Vista or Windows 7. Use the following links to download the version of the server language packs that match your Speech Platform Runtime version:

See Language Support for a list of languages for which you can download language packs.

The Speech Platform SDK 11 accepts all valid language-country codes. If the grammar element specifies only a language code, and not a country/region code, for the xml:lang attribute (such as xml:lang="en"), then any installed recognizer that expresses support for that generic, region-independent language will be able to load the grammar. See Language Identifier Constants and Strings for a comprehensive list of language codes.
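For instance, a grammar that should load on any installed English recognizer could declare only the language code, as in this sketch (the rule name Greeting is hypothetical):

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- xml:lang="en" names the generic language, with no country/region code -->
<grammar version="1.0" xml:lang="en" root="Greeting"
   xmlns="http://www.w3.org/2001/06/grammar">

   <rule id="Greeting" scope="public">
      <item>hello</item>
   </rule>
</grammar>
```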

The Speech Platform SDK 11 does not currently support grammars that specify multiple languages. This is a departure from the Speech Recognition Grammar Specification (SRGS) Version 1.0, which allows for a grammar processor to optionally support multiple languages. For example, the SDK does not permit a grammar such as the one shown in the following example:

XML
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-GB" xmlns="http://www.w3.org/2001/06/grammar" root="Digits">
  <rule id="Digits">
    <one-of>
      <item xml:lang="fr-FR">deux</item>
    </one-of>
  </rule>
</grammar>

To support multiple languages for your applications, you can use multiple grammars in parallel, each with a separate single language. An application's recognition engine may load and independently enable or disable one or more grammar files.
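One way to sketch this approach is to split the content into two single-language grammar files that the application loads side by side. The file names digits-en-GB.grxml and digits-fr-FR.grxml are hypothetical, and a recognizer for each language must be installed for its grammar to load.

First file, digits-en-GB.grxml (hypothetical name):

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-GB" root="Digits"
   xmlns="http://www.w3.org/2001/06/grammar">
   <rule id="Digits" scope="public">
      <one-of>
         <item>two</item>
      </one-of>
   </rule>
</grammar>
```

Second file, digits-fr-FR.grxml (hypothetical name):

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="fr-FR" root="Digits"
   xmlns="http://www.w3.org/2001/06/grammar">
   <rule id="Digits" scope="public">
      <one-of>
         <item>deux</item>
      </one-of>
   </rule>
</grammar>
```

Each document declares exactly one language, so neither contains an item whose xml:lang differs from that of its grammar element.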

Note

The Speech Platform SDK 11 does support multiple languages in Speech Synthesis Markup Language (SSML) documents used to create prompts for synthesized speech. See speak Element (Microsoft.Speech).

Example

The following is an example of a simple grammar that declares every attribute of the grammar element except sapi:alphabet:

XML
<?xml version="1.0" encoding="utf-8"?>
<grammar 
   version="1.0" mode="voice" root="Welcome"
   tag-format="semantics/1.0" xml:lang="en-US"
   xml:base="http://www.contoso.com/"
   xmlns="http://www.w3.org/2001/06/grammar"
   xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions">

<rule id="Welcome">
   <item>
      Welcome to the managed code API for speech on servers.
   </item>
</rule>

</grammar>