Grammar Format Tags (Microsoft Speech Platform)

Microsoft Speech Platform SDK 11

Grammar Format Tags

The SAPI text grammar format is composed of XML tags, which can be structured to define the phrases that the speech recognition engine recognizes. This document explains each tag in more detail, including sample source code, sample XML grammar snippets, and relevant application scenarios.

The XML tag descriptions are organized by XML element, where each element description contains information for relevant attributes.

XML Tags: Elements

<DEFINE>

Summary: The DEFINE tag is used for declaring a set of string identifiers for numeric values.

XML Attributes:
	None

XML Parent Elements:
	GRAMMAR: The container for the entire XML grammar.

XML Child Elements:
	ID (1 or more required): The DEFINE tag can contain one or more ID tags, each
			of which defines one string identifier.

Detailed Description:
	None

XML Grammar Sample(s):
	<GRAMMAR>
		<DEFINE>
			<ID NAME="TheNumberFive" VAL="5"/>
		</DEFINE>

		<!-- Note that the rule ID takes a number; "TheNumberFive" resolves to the value 5 -->
		<RULE ID="TheNumberFive" TOPLEVEL="ACTIVE">
			<P>five</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	See the ID tag.

<DICTATION>

Summary: The DICTATION tag is used in rules or phrases that need basic dictation support.

XML Attributes:
	MAX (optional, type=VT_I4, default=MIN): Specifies the maximum number of dictation
		words that can be recognized.
		The application must specify a MAX value that is greater than or equal to
			the MIN value. The application can specify a pseudo-infinite maximum
			by specifying INF as the MAX. The pseudo-infinite is actually 255
			dictation words.
		An application that needs free-form dictation, such as the subject line of
			an email, should use a large MAX. Alternatively, an application that
			needs to recognize a person's name may want a much smaller value,
			such as 5 words.
	MIN (optional, type=VT_I4, default=1): Specifies the minimum number of dictation
			words that must be recognized.
		If the grammar author specifies the MIN value, and the recognizer does not
			meet the minimum, the rule will fail to be recognized.
		A scenario where it may make sense to set a value greater than one would be
			an application that is asking for a first and last name.
	PROPID (optional, type=VT_I4): Specifies the semantic property's numeric identifier.
	PROPNAME (optional): Specifies the semantic property's string identifier.
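
	The MIN and MAX defaulting rules above can be sketched in a few lines of standard C++.
		This is only an illustration under stated assumptions: DictationRange and
		ResolveDictationRange are hypothetical names, not SAPI APIs, an empty string
		stands for an omitted attribute, and throwing on MAX < MIN is a choice made
		for this sketch (the text above does not say how the grammar compiler reports
		that case).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper type for this sketch; not part of SAPI.
struct DictationRange {
    int min;
    int max;
};

// The pseudo-infinite maximum described above.
const int kDictationInf = 255;

// Resolves the MIN and MAX attribute strings of a DICTATION tag.
// An empty string stands for an attribute that was not specified.
inline DictationRange ResolveDictationRange(const std::string& minAttr,
                                            const std::string& maxAttr)
{
    DictationRange range;
    range.min = minAttr.empty() ? 1 : std::stoi(minAttr);   // MIN defaults to 1
    if (maxAttr.empty())
        range.max = range.min;                               // MAX defaults to MIN
    else if (maxAttr == "INF")
        range.max = kDictationInf;                           // INF means 255 words
    else
        range.max = std::stoi(maxAttr);
    if (range.max < range.min)
        throw std::invalid_argument("MAX must be greater than or equal to MIN");
    return range;
}

// Usage: the attribute combinations used by the sample grammar in this section.
inline void DemoDictationRange()
{
    assert(ResolveDictationRange("", "INF").max == 255);  // MAX="INF"
    assert(ResolveDictationRange("", "").max == 1);       // neither attribute given
    assert(ResolveDictationRange("", "2").max == 2);      // MAX="2"
}
```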

XML Parent Elements:
	LIST, L: List of phrases which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase that may be recognized.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	None

Detailed Description:
	The DICTATION tag is designed for applications that need to integrate command &
		control and dictation support into a CFG. For example, an application may
		allow the user to speak free-form dictation into a command (e.g. "save document
		as our family's budget", where "our family's budget" is free-form dictation).
	The application may also create a CFG which supports a set of specific phrases or words,
		and also includes a single DICTATION tag in case of an unexpected user-phrase.
		For example, a CFG may include a set of address book names which are known, and
		if the user speaks another name, then the application prompts the user for
		validation of the dictated result. Note that the SR engine's accuracy may
		suffer by mixing dictation and CFG phrases together, since many words sound
		similar, and a CFG is generally preferred for application development with known
		words.
	The grammar author can also use a special character, asterisk (*) instead of the entire
		XML tag. See XML Grammar Format: Special Dictation Tag.
	By using semantic properties, the application can easily retrieve the exact text that
		was dictated by the speaker. To specify a semantic property for the DICTATION tag
		the grammar author should specify the PROPID and/or PROPNAME attributes. The
		SAPI run time will automatically set the semantic tag's starting phrase element,
		allowing the application to search for the specific semantic property in the
		properties hierarchy (see SPPHRASEPROPERTY.ulFirstElement). If multiple dictation
		words are recognized by the SR engine (e.g. DICTATION MAX > 1), then the SAPI
		run time will generate multiple semantic properties, one for each word, where
		all of the properties will have the same numeric ID and/or string NAME.
	If the speech recognition engine supports multiple dictation topics (e.g. spelling,
		general, legal, medical, etc.), the DICTATION tag in the grammar will refer to
		the topic that was selected when ISpRecoGrammar::LoadDictation was called. If the
		topic was not explicitly selected, then the default SR engine dictation topic
		will be loaded. Currently, it is not possible to load multiple dictation topics
		inside of a single command & control grammar. Applications should create multiple
		grammar objects to implement the latter scenario.
	If there is ambiguity between a dictation phrase and a CFG phrase, the speech
		recognition engine will typically choose the CFG phrase. Preferring CFGs over
		dictation prevents dictation from automatically consuming all CFG phrases.
	The speech recognition engine must support dictation inside of a CFG for the grammar
		to load and activate successfully. The application can determine if an engine
		supports the DICTATION tag by retrieving the SR engine's object token (see
		ISpRecognizer::GetRecognizer), and then checking for the existence of the
		engine attribute "DictationInCFG" (see ISpObjectToken::MatchesAttributes).
		The engine can specify support for the DICTATION tag to be anywhere in the
		CFG phrase (attribute value="Anywhere"), or only at the end (attribute
		value="Trailing").

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- basic command to create a self-note for the user with free-form text -->
		<RULE ID="SelfNote" TOPLEVEL="ACTIVE">
			<P>note to self</P>
			<DICTATION MAX="INF"/>
		</RULE>

		<!-- command to query a name from an address book -->
		<RULE ID="QueryName" TOPLEVEL="ACTIVE">
			<P>list first names of all persons with last name</P>
			<!-- Store only one word for the last name; more words will cause the command to fail -->
			<DICTATION MAX="1"/>
		</RULE>

		<!-- command to handle first and last names with semantic properties -->
		<!-- By using semantic properties, the application can ignore all of
			the text returned, except for the text associated with the dictation
			tags' semantic properties "PID_FirstName" and "PID_LastName" -->
		<RULE ID="SubmitName" TOPLEVEL="ACTIVE">
			<P>
				my first name is
				<!-- Note the implicit maximum is only one word -->
				<DICTATION PROPID="PID_FirstName"/>
				and my last name is
				<!-- Note the explicit maximum is two words -->
				<DICTATION PROPID="PID_LastName" MAX="2"/>
			</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To programmatically create a dictation transition (i.e. DICTATION tag) in a CFG, the application developer
		can use the ISpGrammarBuilder::AddRuleTransition with a special rule handle,
		called SPRULETRANS_DICTATION. For example, the following code creates a simple
		command called "SendMail" which recognizes the command "send mail to DICTATION".

			SPSTATEHANDLE hsSendMail;
			// Create new top-level rule called "SendMail"
			hr = cpRecoGrammar->GetRule(L"SendMail", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsSendMail);
			// Check hr

			// Create an interim state before the dictation transition
			SPSTATEHANDLE hsBeforeDictation;
			hr = cpRecoGrammar->CreateNewState(hsSendMail, &hsBeforeDictation);
			// Check hr

			// Add the command words "send mail to"
			hr = cpRecoGrammar->AddWordTransition(hsSendMail, hsBeforeDictation,
					L"send mail to", L" ", SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add trailing dictation transition
			hr = cpRecoGrammar->AddRuleTransition(hsBeforeDictation, NULL,
								SPRULETRANS_DICTATION, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

	Note that the previous sample code only supports one dictation word. To support
		more than one word, the code would need to build more dictation transition
		states, each of which begins at the previous dictation state - effectively,
		a series of consecutive single-word dictation transitions.

<GRAMMAR>

Summary: The GRAMMAR tag is the outermost container for the XML grammar definition.

XML Attributes:
	LANGID (optional, type=numeric): The language identifier of the grammar.
		The identifier will be compared against the supported languages of the
			Speech Recognition engine. If the language is not supported, the
			grammar load call will fail (e.g. ISpRecoGrammar::LoadCmdFromFile).
		It is recommended that all XML grammars include the LANGID attribute to avoid
			the scenario where the SR engine tries to load a grammar with an
			unspecified language ID, and fails due to confusing words.
		SAPI supports fuzzy language ID matching, in that the SR engine can report
			that it supports the major portion of the language ID (e.g. 0x009 in
			0x409), which means the SR engine will try to load and recognize any
			grammar that matches the major portion of the language ID.
	LEXDELIMITER (optional): Specifies the delimiter for explicit lexicon entries
			specified in the grammar.
		Grammar authors are able to specify the lexicon information by using a special
			sequence of characters. The sequence of characters is:
			LEXDELIMITERDisplayFormLEXDELIMITERLexicalFormLEXDELIMITERPronunciation;
		The default delimiter is the forward slash character "/".
		See also PHRASE.
	WORDTYPE (optional): Specifies the type of the word(s) when they are added to the
			grammar.
		The default value is "LEXICAL", and the value must be "LEXICAL".
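
	The "major portion" in the fuzzy-matching description corresponds to the primary
		language bits of a Windows language identifier (the low 10 bits, so 0x409
		yields 0x009). The following sketch models that comparison in standard C++.
		EngineAcceptsGrammar is a hypothetical function written for illustration;
		the matching that SAPI and a given SR engine actually perform is internal
		and may differ.

```cpp
#include <cassert>
#include <cstdint>

// The low 10 bits of a Windows language ID hold the primary ("major") language.
inline std::uint16_t PrimaryLanguage(std::uint16_t langId)
{
    return static_cast<std::uint16_t>(langId & 0x3FF);
}

// Hypothetical model of fuzzy matching: an engine that reports only a primary
// language (sub-language bits are zero) accepts any grammar with the same
// primary language; otherwise the two identifiers must match exactly.
inline bool EngineAcceptsGrammar(std::uint16_t engineLangId,
                                 std::uint16_t grammarLangId)
{
    if (engineLangId == grammarLangId)
        return true;
    const bool enginePrimaryOnly = (engineLangId >> 10) == 0;
    return enginePrimaryOnly && PrimaryLanguage(grammarLangId) == engineLangId;
}

// Usage: 0x409 is US English, 0x809 is British English, 0x009 is "any English".
inline void DemoLanguageMatching()
{
    assert(PrimaryLanguage(0x409) == 0x009);
    assert(EngineAcceptsGrammar(0x009, 0x409));   // fuzzy match on English
    assert(EngineAcceptsGrammar(0x009, 0x809));   // fuzzy match on English
    assert(!EngineAcceptsGrammar(0x409, 0x809));  // specific engine, other locale
}
```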

XML Parent Elements:
	None

XML Child Elements:
	DEFINE (optional): Specifies the constant definitions for the grammar.
	RULE (1 or more required): Specifies the rules, including top-level and non-top-level.

Detailed Description:
	Every XML grammar must have the container tag, GRAMMAR.

XML Grammar Sample(s):
	<!-- Language ID = British English -->
	<GRAMMAR LANGID="809" LEXDELIMITER="|" WORDTYPE="LEXICAL">
		<RULE NAME="HelloWorld" TOPLEVEL="ACTIVE">
			<!-- when the user says the following pronunciation, "Hiya" will be displayed -->
			<P>|Hiya|Hello|h eh l ow;</P>
		</RULE>
	</GRAMMAR>
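
	To make the delimiter sequence concrete, the following standard C++ sketch splits an
		explicit lexicon entry such as the one in the sample above into its display
		form, lexical form, and pronunciation. LexiconEntry and ParseLexiconEntry
		are hypothetical names; the grammar compiler's real parsing rules (escaping,
		omitted fields, and so on) are not described here, so this handles only the
		complete three-field form.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the three fields of an explicit lexicon entry.
struct LexiconEntry {
    std::string display;
    std::string lexical;
    std::string pronunciation;
};

// Parses "<d>Display<d>Lexical<d>Pronunciation;" where <d> is the delimiter.
// Returns false if the text does not have exactly that overall shape.
inline bool ParseLexiconEntry(const std::string& text, char delimiter,
                              LexiconEntry& out)
{
    if (text.size() < 2 || text[0] != delimiter || text[text.size() - 1] != ';')
        return false;
    // Strip the leading delimiter and the trailing semicolon.
    const std::string body = text.substr(1, text.size() - 2);
    const std::string::size_type first = body.find(delimiter);
    if (first == std::string::npos)
        return false;
    const std::string::size_type second = body.find(delimiter, first + 1);
    if (second == std::string::npos)
        return false;
    out.display = body.substr(0, first);
    out.lexical = body.substr(first + 1, second - first - 1);
    out.pronunciation = body.substr(second + 1);
    return true;
}

// Usage: the entry from the sample grammar, with LEXDELIMITER="|".
inline void DemoParseLexiconEntry()
{
    LexiconEntry entry;
    assert(ParseLexiconEntry("|Hiya|Hello|h eh l ow;", '|', entry));
    assert(entry.display == "Hiya");
    assert(entry.lexical == "Hello");
    assert(entry.pronunciation == "h eh l ow");
}
```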

Programmatic Equivalent:
	To programmatically set the language ID of a new grammar, the application developer should
		call ISpGrammarBuilder::ResetGrammar.
	The application developer does not need to change the LEXDELIMITER or the WORDTYPE, since the ISpLexicon
		interface can be used to modify the lexicon.

<ID>

Summary: The ID tag is used for declaring a string identifier for a numeric
value.

XML Attributes:
	NAME (required): The NAME attribute defines the string identifier that will be associated
                             with the constant value.
	VAL (required, type=VT_UI4,VT_I4,VT_R4,VT_R8): The VAL attribute defines the constant
				value that will be associated with the string identifier.

XML Parent Elements:
	DEFINE: The container for the constant definitions.

XML Child Elements:
	None

Detailed Description:
	The ID tag should be used by the grammar author to make the grammar easier to read and
		maintain. The grammar author can use string identifiers which succinctly explain
		the use of the identifier (e.g. RID_FileNew, PVAL_MAIN_WINDOW, etc.). The grammar
		compiler stores identifiers in the binary format, and string identifiers are
		typically much larger than numeric identifiers, so mapping names to numeric
		values keeps the compiled grammar smaller. Also, the application developer
		can use a simple numeric comparison to handle rule and semantic property logic,
		rather than performing a more complex string comparison.

XML Grammar Sample(s):

	<GRAMMAR>
		<DEFINE>
			<ID NAME="RuleId_A" VAL="1"/>
			<ID NAME="PropId_B" VAL="2"/>
			<ID NAME="PropVal_AB" VAL="3"/>
		</DEFINE>

		<!-- Note that the rule ID, phrase PROPID, and VAL take numeric values. -->
		<RULE ID="RuleId_A" TOPLEVEL="ACTIVE">
			<P PROPID="PropId_B" VAL="PropVal_AB">five</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	The Grammar Compiler that ships in the Microsoft Speech SDK includes a command line
              argument to generate a C-style header (see "-h"), which includes the programmatic
              constant definitions for all of the IDs defined in the XML grammar. The application
              developer can include the header file and easily use the same identifiers inside
              the application logic, without needing to redefine and maintain the numeric values.
	The XML Grammar Sample above would create the following C-style header file:
		#define RuleId_A 1
		#define PropId_B 2
		#define PropVal_AB 3
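
	The mapping from ID tags to header lines is simple enough to sketch in standard C++.
		This is not the Grammar Compiler's implementation, and the real tool's
		output may contain more than these lines; MakeHeader is a hypothetical
		function that only illustrates how each NAME/VAL pair becomes one #define.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Each pair holds the NAME and VAL attributes of one ID tag.
typedef std::vector<std::pair<std::string, std::string> > IdList;

// Hypothetical helper: emits one "#define NAME VAL" line per ID tag.
inline std::string MakeHeader(const IdList& ids)
{
    std::ostringstream out;
    for (IdList::const_iterator it = ids.begin(); it != ids.end(); ++it)
        out << "#define " << it->first << ' ' << it->second << '\n';
    return out.str();
}

// Usage: the three ID tags from the XML Grammar Sample above.
inline void DemoMakeHeader()
{
    IdList ids;
    ids.push_back(std::make_pair(std::string("RuleId_A"), std::string("1")));
    ids.push_back(std::make_pair(std::string("PropId_B"), std::string("2")));
    ids.push_back(std::make_pair(std::string("PropVal_AB"), std::string("3")));
    assert(MakeHeader(ids) ==
           "#define RuleId_A 1\n#define PropId_B 2\n#define PropVal_AB 3\n");
}
```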

<LIST>, <L>

Summary: The LIST tag is used for specifying a list of phrases or transitions.

XML Attributes:
	PROPID (optional, type=VT_I4): The numeric identifier that will be inherited by all
			semantic properties in the child elements (e.g. phrases).
	PROPNAME (optional): The string identifier that will be inherited by all semantic
			properties in the child elements (e.g. phrases).

XML Parent Elements:
	LIST, L: List of phrases or rules which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase causing the rule reference to be implicitly optional.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	RULEREF: Import, or reference, another rule's contents.
	PHRASE, P: Specifies text or leaf nodes.
	LIST, L: Specifies a list of phrases or transitions for recognition.
	TEXTBUFFER: Specifies a reference to the run-time application maintained
		text-buffer.
	WILDCARD: Specifies a garbage word; one or more non-silence, ignorable words.
	DICTATION: Specifies a piece of text recognized by the loaded dictation topic.

Detailed Description:
	The LIST tag is a quick and efficient way to support lists of phrases or text. Instead
		of creating separate rules for each piece of text, the LIST tag can be used
		where its children are the phrase, rule reference, or other tags.
	The grammar author can use the shorthand version of the LIST tag, the L tag.
	The LIST tag is more of a virtual tag, since it does not affect the semantic property
		hierarchy (LIST children are not child properties). While it allows the grammar
		author to specify a string or numeric identifier, the identifier is only used
		to pass on to the child element as a default property identifier.

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- Note that the rule is not top-level and is only used as a reusable component rule -->
		<RULE NAME="Numbers">
			<!-- The list tag includes a semantic property Id, "PID_Value" which
				is inherited by all child phrase elements -->
			<LIST PROPID="PID_Value">
				<!-- If the user says "one" then the semantic property returned will
					be the name/value pair "PID_Value"/"1" -->
				<P VAL="1">one</P>
				<P VAL="2">two</P>
				<P VAL="3">three</P>
				<P VAL="4">four</P>
				<P VAL="5">five</P>
			</LIST>
		</RULE>

		<!-- The rule contains a list of various types of transitions -->
		<RULE NAME="Sampler" TOPLEVEL="ACTIVE">
			<!-- the list property specifies a default property name of "TYPE_NUMBER",
				which will be overridden by specific list children -->
			<LIST PROPNAME="TYPE_NUMBER">
				<P VAL="1">one</P>
				<P VAL="2">two</P>
				<P VAL="3">three</P>
				<P PROPNAME="TYPE_STRING" VALSTR="FOUR">four</P>
				<P PROPNAME="TYPE_NONE">five</P>
				<RULEREF NAME="Numbers" PROPNAME="TYPE_RULEREF"/>
				<TEXTBUFFER PROPNAME="TYPE_TEXTBUFFER"/>
				<DICTATION PROPNAME="TYPE_DICTATION"/>
			</LIST>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To programmatically create a list, or a set of sibling/parallel transitions, the application
		needs to create a start state, then create multiple transitions out of the state. For
		example, the following sample code shows how to make a list of phrases (e.g. "one",
		"two", "three").

			SPSTATEHANDLE hsList;
			// Create new top-level rule called "List"
			hr = cpRecoGrammar->GetRule(L"List", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsList);
			// Check hr

			// Add the word "one" to the list
			hr = cpRecoGrammar->AddWordTransition(hsList, NULL,
					L"one", L" ",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the word "two" to the list
			hr = cpRecoGrammar->AddWordTransition(hsList, NULL,
					L"two", L" ",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the word "three" to the list
			hr = cpRecoGrammar->AddWordTransition(hsList, NULL,
					L"three", L" ",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

		The application developer can use similar code to create a list of rule references,
			dictation, or text buffer transitions. To change the type of list item, change
			the ::AddWordTransition call to ::AddRuleTransition.

<OPT>, <O>

Summary: The OPT tag is used for specifying optional text in a command phrase.

XML Attributes:
	DISP (optional): Specifies the display form of the phrase text.
	MAX (optional, type=VT_I4, default=MIN): Specifies the maximum number of times the user
		can repeat the phrase and still be successfully recognized.
	MIN (optional, type=VT_I4, default=1): Specifies the minimum number of times the user
		must repeat the phrase to be successfully recognized.
	PRON (optional): Specifies the pronunciation to be used by the recognizer when listening
		for the text.
	PROPID (optional, type=VT_I4): Specifies the numeric identifier to associate with the phrase
		tag's semantic property.
	PROPNAME (optional): Specifies the string identifier to associate with the phrase tag's
		semantic property.
	VAL (optional, type=VT_I4): Specifies the semantic property's numeric value.
	VALSTR (optional): Specifies the semantic property's string value.
	WEIGHT (type=VT_UI4,VT_I4,VT_R4,VT_R8, default=1/n_sibling_transitions): The probability
			that the user will speak the contents of the OPT tag, versus another
			sibling transition or phrase.

XML Parent Elements:
	LIST, L: List of phrases which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase that may be recognized.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	RULEREF: Import, or reference, another rule's contents.
	PHRASE, P: Specifies text or leaf nodes.
	OPT, O: Optional phrase causing the rule reference to be implicitly optional.
	LIST, L: Specifies a list of phrases or transitions for recognition.
	TEXTBUFFER: Specifies a reference to the run-time application maintained
		text-buffer.
	WILDCARD: Specifies a garbage word; one or more non-silence, ignorable words.
	DICTATION: Specifies a piece of text recognized by the loaded dictation topic.

Detailed Description:
	The OPT tag along with the PHRASE tag are the only tags that can directly
		contain recognizable text.
	The grammar author can use the shorthand version of the OPT tag, the O tag.
	The grammar author can also specify custom word pronunciations and display
		text by using the PRON and DISP attributes. For example, a grammar
		might contain application or domain specific text, which has a custom
		pronunciation. The author can specify the pronunciation on a specific
		OPT tag to avoid the need for updating the user or application
		lexicon (especially if the pronunciation is command specific).
	The grammar author can also use special shorthand characters inside of the
		content section of the OPT tag (e.g. dictation, wildcard, etc.). See
		the XML Special Characters.

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- Create a simple "hello world" rule -->
		<!-- the second word is optional -->
		<RULE NAME="HelloWorld" TOPLEVEL="ACTIVE">
			<P>hello</P>
			<OPT>world</OPT>
		</RULE>

		<!-- Create a rule that changes the pronunciation and the display
			form of the phrase. When the user says "eh" the display
			text will be "I don't understand". Note the user didn't
			say "what". The pronunciation for "what" is specific to this
			phrase tag and is not changed for the user or application
			lexicon, or even other instances of "what" in the grammar -->
		<RULE NAME="Question_Pron" TOPLEVEL="ACTIVE">
			<P DISP="I don't understand" PRON="eh">what</P>
		</RULE>

		<!-- Create a phrase with an attached semantic property -->
		<!-- Speaking "one two three" will return three different unique
			semantic properties, with different names, and different
			values -->
		<!-- Speaking "one three" will return two different unique
			semantic properties, with different names, and different
			values -->
		<!-- Speaking "one two" will return two different unique
			semantic properties, with different names, and different
			values -->
		<!-- Speaking "one" will return only one semantic property,
			the named property without a value -->
		<!-- Note that the number of semantic properties returned is
			variable, and that the application should be designed to
			handle all of the variations -->
		<RULE NAME="UseProps" TOPLEVEL="ACTIVE">
			<!-- named property, without value -->
			<P PROPNAME="NOVALUE">one</P>

			<!-- named property, with numeric value -->
			<O PROPNAME="NUMBER" VAL="2">two</O>

			<!-- named property, with string value -->
			<O PROPNAME="STRING" VALSTR="three">three</O>
		</RULE>

		<!-- Create a rule for optional command prefix -->
		<!-- Note that entire rule reference is optional. In cases where
			there are properties associated with the rule reference, the
			semantic property tree may change -->
		<!-- the rule supports the phrases "play cards", "please play cards", and
			"pretty please play cards" -->
		<RULE NAME="PlayCard" TOPLEVEL="ACTIVE">
			<O><RULEREF NAME="PLEASE"/></O>
			<P>play cards</P>
		</RULE>

		<!-- The first word "pretty" is optional, while the second is required -->
		<RULE NAME="PLEASE">
			<O>pretty</O>
			<P>please</P>
		</RULE>
	</GRAMMAR>
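
	The variations that a rule with optional parts accepts can be enumerated mechanically.
		The following standard C++ sketch models a rule as a flat sequence of
		required and optional items and lists every phrase the sequence allows.
		RuleItem and ExpandPhrases are hypothetical names used only for this
		illustration; this is not how the SR engine evaluates a grammar, and
		nested rule references are not modeled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one P (required) or O (optional) item in a rule.
struct RuleItem {
    std::string text;
    bool optional;
};

// Returns every phrase accepted by a sequence of required/optional items.
inline std::vector<std::string> ExpandPhrases(const std::vector<RuleItem>& items)
{
    std::vector<std::string> results(1, std::string());
    for (std::size_t i = 0; i < items.size(); ++i) {
        std::vector<std::string> next;
        for (std::size_t j = 0; j < results.size(); ++j) {
            const std::string& prefix = results[j];
            if (items[i].optional)
                next.push_back(prefix);  // the optional item is skipped
            next.push_back(prefix.empty() ? items[i].text
                                          : prefix + " " + items[i].text);
        }
        results.swap(next);
    }
    return results;
}

// Usage: the "UseProps" rule above (required "one", optional "two" and "three").
inline void DemoExpandPhrases()
{
    std::vector<RuleItem> rule;
    RuleItem one = { "one", false };
    RuleItem two = { "two", true };
    RuleItem three = { "three", true };
    rule.push_back(one);
    rule.push_back(two);
    rule.push_back(three);
    const std::vector<std::string> phrases = ExpandPhrases(rule);
    assert(phrases.size() == 4);
    assert(phrases[0] == "one");
    assert(phrases[1] == "one three");
    assert(phrases[2] == "one two");
    assert(phrases[3] == "one two three");
}
```

	An application that handles the "UseProps" rule therefore has to cope with four
		distinct recognitions, each carrying a different set of semantic properties.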

Programmatic Equivalent:
	To add an optional phrase to a rule, SAPI provides an API called
		ISpGrammarBuilder::AddWordTransition. The application developer can add
		the optional structure as follows:

			SPSTATEHANDLE hsHelloWorld;
			// Create new top-level rule called "HelloWorld"
			hr = cpRecoGrammar->GetRule(L"HelloWorld", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsHelloWorld);
			// Check hr

			// create an interim state
			SPSTATEHANDLE hInterim;
			hr = cpRecoGrammar->CreateNewState(hsHelloWorld, &hInterim);
			// Check hr

			// Add the command word "hello" which terminates at the interim
			//	state
			hr = cpRecoGrammar->AddWordTransition(hsHelloWorld, hInterim,
					L"hello", NULL,
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the optional command word "world"
			hr = cpRecoGrammar->AddWordTransition(hInterim, NULL,
					L"world", NULL,
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the epsilon transition, which means no word need be spoken
			hr = cpRecoGrammar->AddWordTransition(hInterim, NULL,
					NULL, NULL,
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

<PHRASE>, <P>

Summary: The PHRASE tag and the OPT tag are the sole methods of explicitly specifying text to be
		recognized by the speech recognition engine.

XML Attributes:
	DISP (optional): Specifies the display form of the phrase text.
	MAX (optional, type=VT_I4, default=MIN): Specifies the maximum number of times the user
		can repeat the phrase and still be successfully recognized.
	MIN (optional, type=VT_I4, default=1): Specifies the minimum number of times the user
		must repeat the phrase to be successfully recognized.
	PRON (optional): Specifies the pronunciation to be used by the recognizer when listening
		for the text.
	PROPID (optional, type=VT_I4): Specifies the numeric identifier to associate with the phrase
		tag's semantic property.
	PROPNAME (optional): Specifies the string identifier to associate with the phrase tag's
		semantic property.
	VAL (optional, type=VT_I4): Specifies the semantic property's numeric value.
	VALSTR (optional): Specifies the semantic property's string value.
	WEIGHT (type=VT_UI4,VT_I4,VT_R4,VT_R8, default=1/n_sibling_transitions): The probability
			that the user will speak the contents of the PHRASE tag, versus another
			sibling transition or phrase.
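
	The WEIGHT default can be illustrated with a short standard C++ sketch. It assumes
		that a negative number stands for an omitted WEIGHT attribute, and
		ResolveWeights is a hypothetical helper; whether and how the engine
		normalizes a mix of explicit and default weights is not described here,
		so the sketch only fills in the documented default of 1/n_sibling_transitions.

```cpp
#include <cassert>
#include <vector>

// Fills in the default weight for sibling transitions.
// Assumption for this sketch: a negative entry means "WEIGHT not specified".
inline std::vector<double> ResolveWeights(const std::vector<double>& weights)
{
    std::vector<double> resolved;
    if (weights.empty())
        return resolved;
    const double defaultWeight = 1.0 / static_cast<double>(weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i)
        resolved.push_back(weights[i] < 0.0 ? defaultWeight : weights[i]);
    return resolved;
}

// Usage: four sibling phrases with no WEIGHT attribute each default to 1/4.
inline void DemoResolveWeights()
{
    const std::vector<double> unspecified(4, -1.0);
    const std::vector<double> resolved = ResolveWeights(unspecified);
    assert(resolved.size() == 4);
    assert(resolved[0] == 0.25);
    assert(resolved[3] == 0.25);
}
```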

XML Parent Elements:
	LIST, L: List of phrases which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase that may be recognized.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	RULEREF: Import, or reference, another rule's contents.
	PHRASE, P: Specifies text or leaf nodes.
	OPT, O: Optional phrase causing the rule reference to be implicitly optional.
	LIST, L: Specifies a list of phrases or transitions for recognition.
	TEXTBUFFER: Specifies a reference to the run-time application maintained
		text-buffer.
	WILDCARD: Specifies a garbage word; one or more non-silence, ignorable words.
	DICTATION: Specifies a piece of text recognized by the loaded dictation topic.

Detailed Description:
	The PHRASE tag along with the OPT tag are the only tags that can directly
		contain recognizable text. Except for grammars that contain rule
		references, every grammar must have at least one PHRASE tag.
	The grammar author can use the shorthand version of the PHRASE tag, the P tag.
	The grammar author can also specify custom word pronunciations and display
		text by using the PRON and DISP attributes. For example, a grammar
		might contain application or domain specific text, which has a custom
		pronunciation. The author can specify the pronunciation on a specific
		PHRASE tag to avoid the need for updating the user or application
		lexicon (especially if the pronunciation is command specific).
	The grammar author can also use special shorthand characters inside of the
		content section of the PHRASE tag (e.g. dictation, wildcard, etc.). See
		the XML Special Characters.

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- Create a simple "hello world" rule -->
		<RULE NAME="HelloWorld" TOPLEVEL="ACTIVE">
			<P>hello world</P>
		</RULE>

		<!-- Create a more advanced "hello world" rule that changes the
			display form. When the user says "hello world" the display
			text will be "Hiya there!" -->
		<RULE NAME="HelloWorld_Disp" TOPLEVEL="ACTIVE">
			<P DISP="Hiya there!">hello world</P>
		</RULE>

		<!-- Create a rule that changes the pronunciation and the display
			form of the phrase. When the user says "eh" the display
			text will be "I don't understand". Note the user didn't
			say "what". The pronunciation for "what" is specific to this
			phrase tag and is not changed for the user or application
			lexicon, or even other instances of "what" in the grammar -->
		<RULE NAME="Question_Pron" TOPLEVEL="ACTIVE">
			<P DISP="I don't understand" PRON="eh">what</P>
		</RULE>

		<!-- Create a rule demonstrating repetition -->
		<!-- the rule will only be recognized if the user says "hey diddle
			diddle" -->
		<RULE NAME="NurseryRhyme" TOPLEVEL="ACTIVE">
			<P>hey</P>
			<P MIN="2" MAX="2">diddle</P>
		</RULE>

		<!-- Create a list with variable phrase weights -->
		<!-- If the user says similar phrases, the recognizer will use
			the weights to pick a match -->
		<RULE NAME="UseWeights" TOPLEVEL="ACTIVE">
			<LIST>
				<!-- Note the higher likelihood that the user is
					expected to say "recognize speech" -->
				<P WEIGHT=".95">recognize speech</P>
				<P WEIGHT=".05">wreck a nice beach</P>
			</LIST>
		</RULE>

		<!-- Create a phrase with an attached semantic property -->
		<!-- Speaking "one two three" will return three different unique
			semantic properties, with different names, and different
			values -->
		<RULE NAME="UseProps" TOPLEVEL="ACTIVE">
			<!-- named property, without value -->
			<P PROPNAME="NOVALUE">one</P>

			<!-- named property, with numeric value -->
			<P PROPNAME="NUMBER" VAL="2">two</P>

			<!-- named property, with string value -->
			<P PROPNAME="STRING" VALSTR="three">three</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To add a phrase to a rule, SAPI provides an API called
		ISpGrammarBuilder::AddWordTransition. The application developer can add
		the sentences as follows:

			SPSTATEHANDLE hsHelloWorld;
			// Create new top-level rule called "HelloWorld"
			hr = cpRecoGrammar->GetRule(L"HelloWorld", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsHelloWorld);
			// Check hr

			// Add the command words "hello world"
			// Note that the lexical delimiter is " ", a space character.
			//	By using a space delimiter, the entire phrase can be added
			// 	in one method call
			hr = cpRecoGrammar->AddWordTransition(hsHelloWorld, NULL,
					L"hello world", L" ",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the command words "hiya there"
			// Note that the lexical delimiter is "|", a pipe character.
			//	By using a pipe delimiter, the entire phrase can be added
			// 	in one method call
			hr = cpRecoGrammar->AddWordTransition(hsHelloWorld, NULL,
					L"hiya|there", L"|",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

<RESOURCE>

Summary: The RESOURCE tag is used by grammar authors who want to store arbitrary string
	data on rules (e.g. for use by a CFG interpreter, or an SR engine aware of
	the resources).

XML Attributes:
	NAME: Specifies the name of the resource to attach to the rule.

XML Parent Elements:
	RULE: The rule that contains the resource reference.

XML Child Elements:
	[CDATA] (required): The resource value is specified by a CDATA section.
	For example,
		<![CDATA[This is a test string]]>
	The RESOURCE tag contains the CDATA element, which itself contains the string.

Detailed Description:
	The RESOURCE tag is a facility allowing the grammar author to communicate
		information [attached to rules] to a CFG Interpreter or a
		speech recognition engine that is aware of the resource information.

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- Note resource value can be any string -->
		<RULE ID="RID_TestResource" TOPLEVEL="ACTIVE">
			<RESOURCE NAME="AResource">
				<![CDATA[AResource's Value: String]]>
			</RESOURCE>
			<P>test an embedded resource</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To add a resource to a rule, SAPI provides an API called
		ISpGrammarBuilder::AddResource. The application developer can add
		the aforementioned resource (see XML Grammar Sample) with the following
		code:

			SPSTATEHANDLE hsTestResource;
			// Create new top-level rule with the ID RID_TestResource
			hr = cpRecoGrammar->GetRule(NULL, RID_TestResource,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsTestResource);
			// Check hr

			// Add the command words "test an embedded resource"
			hr = cpRecoGrammar->AddWordTransition(hsTestResource, NULL,
					L"test an embedded resource", L" ",
					SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add the resource named "AResource"
			hr = cpRecoGrammar->AddResource(hsTestResource,
							L"AResource",
							L"AResource's Value: String");
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

		Then, the SR-Engine can retrieve the resource value when it is processing
			the rule updates or CFG-recognition by making the following call:

				// set hRule to the handle of the rule that owns the resource
				WCHAR *pwszResValue = NULL;

				hr = cpSREngineSite->GetResource(hRule,
					L"AResource",
					&pwszResValue);
				if (S_OK == hr)
				{
					// pwszResValue contains the value
					// perform value-sensitive processing

					// release value memory
					::CoTaskMemFree(pwszResValue);
				}


<RULE>

Summary: The RULE tag is the core tag for defining which commands are available for
		recognition. Every grammar must have at least one top-level rule, and
		every rule must have at least one rule reference or recognizable text.

XML Attributes:
	DYNAMIC (optional, default is FALSE): Specifies whether the rule supports dynamic
		modifications at run time. By default, an application cannot modify rules
		in an XML grammar. To modify a rule, the rule must be marked DYNAMIC, and
		the grammar must be loaded with the dynamic flag (see ISpRecoGrammar and
		SPLOADOPTIONS). Dynamic rules cannot be marked EXPORT.
	EXPORT (optional, default is FALSE): Specifies whether the rule allows external
		grammars to reference it. For example, a grammar author who wants to allow
		other grammar authors to reuse her rules must mark each of the reusable
		rules with EXPORT="TRUE". Exported rules cannot be marked DYNAMIC.
	ID (required if NAME is omitted, type=VT_I4): Specifies the numeric identifier of the rule. The ID
		or the NAME must be specified, or both. The identifier must be unique in
		the rule namespace, which is the entire grammar (see GRAMMAR).
	INTERPRETER (optional, default is FALSE): Specifies if the rule should use the
		CFG interpreter when it is recognized. For example,
		a rule might contain semantic properties or text that should be modified
		at run time (e.g. replace value of the semantic property named "TODAY" with
		the system's current date and time).
	NAME (required if ID is omitted): Specifies the string identifier of the rule. The NAME
		or the ID must be specified, or both. The identifier must be unique in
		the rule namespace, which is the entire grammar (see GRAMMAR).
	TOPLEVEL (optional): Specifies that the rule is directly recognizable by a user.
		If the TOPLEVEL attribute is not specified, the rule is not recognizable
		unless it is referenced by a top-level rule. For example,
		component rules (see RULEREF) do not need to specify the TOPLEVEL attribute.
		When a grammar author specifies a rule as TOPLEVEL, she must also specify
		whether the rule is enabled by default. If the rule is enabled by default
		(TOPLEVEL="ACTIVE"), the rule is activated when the application activates
		the default set of rules (e.g. ISpRecoGrammar::SetRuleState(NULL, NULL,
		SPRS_ACTIVE)). If a rule is specified as TOPLEVEL="INACTIVE", it is only
		activated when explicitly set to active (see ISpRecoGrammar::SetRuleState
		and ISpRecoGrammar::SetRuleIdState).
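
	The attribute constraints above can be checked mechanically. The following is a
		minimal sketch, not part of SAPI: the RuleAttributes structure and the
		ValidateRuleAttributes function are hypothetical names introduced only to
		illustrate the rules stated in this section (ID or NAME must be present,
		DYNAMIC and EXPORT are mutually exclusive, and TOPLEVEL is either ACTIVE
		or INACTIVE).

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of the RULE attributes described above.
// This is not a SAPI type; it exists only to illustrate the constraints.
struct RuleAttributes
{
    bool hasId = false;       // true when the ID attribute is present
    std::string name;         // NAME attribute (empty when absent)
    bool isDynamic = false;   // DYNAMIC="TRUE"
    bool isExport = false;    // EXPORT="TRUE"
    std::string topLevel;     // "" (absent), "ACTIVE", or "INACTIVE"
};

// Returns an empty string when the attributes are consistent, otherwise a
// short description of the first violated constraint.
std::string ValidateRuleAttributes(const RuleAttributes& attrs)
{
    if (!attrs.hasId && attrs.name.empty())
        return "either ID or NAME must be specified";
    if (attrs.isDynamic && attrs.isExport)
        return "a rule cannot be both DYNAMIC and EXPORT";
    if (!attrs.topLevel.empty() &&
        attrs.topLevel != "ACTIVE" && attrs.topLevel != "INACTIVE")
        return "TOPLEVEL must be ACTIVE or INACTIVE";
    return "";
}
```

	A grammar-authoring tool could run a check of this kind on each rule before
		handing the grammar to the compiler; the actual SAPI compiler may report
		these conditions differently.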

XML Parent Elements:
	GRAMMAR: The container for the entire XML grammar.

XML Child Elements:
	RULEREF: Imports, or references, another rule's contents.
	PHRASE, P: Specifies text or leaf nodes.
	LIST, L: Specifies a list of phrases for recognition.
	OPT, O: Specifies an optional piece of text that can be spoken.
	TEXTBUFFER: Specifies a reference to the run-time application maintained
		text-buffer.
	WILDCARD: Specifies a garbage word; one or more non-silence, ignorable words
	DICTATION: Specifies a piece of text recognized by the loaded dictation topic.
	RESOURCE: Specifies a labeled piece of arbitrary string data which can be
		accessed by a special SR engine, or a CFG interpreter.

Detailed Description:
	The RULE tag is the core of the XML grammar text format. The purpose of creating
		a CFG is to define a specific set of words and phrases that can be
		spoken by the user and recognized by the speech recognition engine. The
		rules can be written by the grammar author in a way that makes them
		reusable, textually maintainable, and conducive to application logic
		that is based on semantic properties or actions (not on phrase text).
	Each rule must contain at least one piece of text, or a rule reference (which
		has the same requirements). Effectively, every rule will eventually end
		with a piece of text (i.e. leaf or terminal node).
	The rule can be identified by either a numeric identifier (ID) or a string identifier
		(NAME). The grammar author can use the DEFINE tag to define constant string
		identifiers for numeric values. By using the constant string identifiers,
		the grammar author can avoid magic numbers (i.e. hard-coded numbers that can
		cause maintenance problems when updating code/grammar). See the ID tag for
		more information on constant identifiers.
	By using rule importing (references) and rule exporting, grammar authors can
		leverage reusable grammar components (e.g. numbers or date grammars).
		Similarly, grammar authors can abstract certain portions of the grammar
		text away from the semantic content by using semantic properties, or
		tags. Semantic properties are name/value pairs which are associated
		with rule nodes in the rule hierarchy, and can even contain relevant
		information from the recognized text (see SPPHRASEPROPERTY.ulStartingElement
		and SPPHRASEPROPERTY.ulCountOfElements).
	The grammar author can also use a CFG interpreter, which is a COM object that can
		re-process the semantic property tree and phrase text to modify the content
		at run time. For example, an application may load a grammar which includes
		a "days of the week" rule. By integrating a CFG interpreter with the grammar,
		the interpreter could replace the "days of the week" properties (e.g. Sunday,
		Monday, Tuesday, etc.) with the actual calendar dates relative to the
		application's host system (e.g. GetSystemTime).
	SAPI supports a feature called "semantic property pushing" which enables
		applications to detect the semantic property structure more accurately at
		recognition time. "Property pushing" is done by SAPI at compile time,
		whereby the compiler moves semantic properties to the last terminal
		node within a rule which remains unambiguous. For example, the phrases "a b
		c d" and "a b e f g" both have the prefix "a b". The compiler will
		automatically split the phrases into three separate phrases, "a b", "c d",
		and "e f g", where the first phrase is the common prefix to both recognizable
		phrases. The purpose of this feature is to enable applications that place
		properties on the phrases to detect which branch is being
		hypothesized as soon as the first unambiguous (non-common) portion of the
		phrase is spoken. When the user speaks "a b" it is not clear whether the user
		will say "a b c d" or "a b e f g". If the user then says "e", the application
		can eliminate the "a b c d" option. If the grammar author attached
		properties to the end of both phrases, the semantic property would be
		returned as soon as the user spoke the first unambiguous portion of the text
		(e.g. "c" or "e"). See Semantic Properties, Hypotheses, and "Property Pushing."
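
	The prefix split described above can be illustrated with a small standalone
		sketch. It is not the SAPI compiler's implementation; SplitWords and
		CommonPrefixLength are hypothetical helpers that only show how the common
		prefix "a b" and the first unambiguous words ("c" and "e") are found for
		the two example phrases.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a phrase into whitespace-separated words.
std::vector<std::string> SplitWords(const std::string& phrase)
{
    std::istringstream stream(phrase);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word)
        words.push_back(word);
    return words;
}

// Returns the number of leading words shared by both phrases. The word at
// this index (if any) is the first unambiguous word of each phrase.
std::size_t CommonPrefixLength(const std::vector<std::string>& a,
                               const std::vector<std::string>& b)
{
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n])
        ++n;
    return n;
}
```

	For "a b c d" and "a b e f g" the shared prefix is two words long, so the
		words at index 2 are the first point at which the two branches can be
		told apart.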

XML Grammar Sample(s):
	<GRAMMAR>
		<DEFINE>
			<ID NAME="RID_Hello" VAL="1"/>
			<ID NAME="RID_World" VAL="2"/>
			<ID NAME="RID_AddNumbers" VAL="3"/>
			<ID NAME="RID_Numbers" VAL="4"/>
			<ID NAME="RID_Numbers_Exportable" VAL="5"/>
			<ID NAME="RID_Names" VAL="6"/>
			<ID NAME="PID_Value" VAL="1"/>
		</DEFINE>
		<!-- create a simple top-level rule that uses a constant defined identifier -->
		<RULE ID="RID_Hello" TOPLEVEL="ACTIVE">
			<P>hello</P>
		</RULE>

		<!-- Create a simple top-level rule that is inactive by default -->
		<RULE NAME="Hiya" TOPLEVEL="INACTIVE">
			<P>hiya</P>
		</RULE>

		<!-- Create a rule, which a CFG-interpreter can re-process to modify the semantic
			properties -->
		<RULE NAME="InterpretedRule" TOPLEVEL="ACTIVE" INTERPRETER="TRUE">
			<P PROPNAME="TODAY">what is today's date</P>
		</RULE>
		
		<!-- Create a simple top-level rule that references another non top-level rule -->
		<RULE ID="RID_AddNumbers" TOPLEVEL="ACTIVE">
			<P>add</P>
			<RULEREF REFID="RID_Numbers"/>
			<P>to</P>
			<RULEREF REFID="RID_Numbers"/>
		</RULE>

		<!-- Note that rule is not top-level and is only used as a reusable component rule -->
		<RULE ID="RID_Numbers">
			<LIST PROPID="PID_Value">
				<P VAL="1">one</P>
				<P VAL="2">two</P>
				<P VAL="3">three</P>
				<P VAL="4">four</P>
				<P VAL="5">five</P>
			</LIST>
		</RULE>

		<!-- mark the rule as dynamic so the application can update the list of names
			at runtime -->
		<RULE ID="RID_Names" DYNAMIC="TRUE">
			<LIST>
				<P>bob</P>
				<P>jane</P>
				<P>kate</P>
				<P>tom</P>
			</LIST>
		</RULE>

		<!-- Mark the rule as exportable, so other external grammars can access it -->
		<RULE ID="RID_Numbers_Exportable" EXPORT="TRUE">
			<LIST PROPID="PID_Value">
				<P VAL="6">six</P>
				<P VAL="7">seven</P>
				<P VAL="8">eight</P>
				<P VAL="9">nine</P>
				<P VAL="10">ten</P>
			</LIST>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	Application developers can programmatically add rules to a grammar by using the
		ISpGrammarBuilder interface inherited by ISpRecoGrammar. The following sample code
		shows how to add a rule to a grammar. To choose the rule attributes, see the
		ISpGrammarBuilder::GetRule method and SPCFGRULEATTRIBUTES.

		SPSTATEHANDLE hHelloWorld;
		// Create new rule called "HelloWorld"
		// Note that the second parameter is the ID, which can also be specified
		// Note also that the rule is marked as top-level and active
		hr = cpRecoGrammar->GetRule(L"HelloWorld", 0, SPRAF_TopLevel | SPRAF_Active,
								TRUE, &hHelloWorld);
		// Check hr

		// add the text "hello world"
		hr = cpRecoGrammar->AddWordTransition(hHelloWorld, NULL, L"hello world",
							L" ", SPWT_LEXICAL, 1.0f, NULL);
		// Check hr

		// save the grammar changes
		hr = cpRecoGrammar->Commit(NULL);
		// Check hr

	The following sample code shows how to modify a rule in an existing grammar. Specifically,
		the code will update the list of names rule shown in the XML Sample Grammar
		section. By updating the names rule, all rules that reference the names will
		automatically be able to recognize the updated names (after calling ::Commit).

		SPSTATEHANDLE hNames;

		// Get a handle to the existing rule
		// Note the use of the constant identifier RID_Names, which was defined in the
		//	XML sample. See the ID tag for information on generating a C-style header
		hr = cpRecoGrammar->GetRule(NULL, RID_Names, NULL, TRUE, &hNames);
		// Check hr

		// clear the rule to update the entire list
		hr = cpRecoGrammar->ClearRule(hNames);
		// Check hr

		// add name "sally"
		hr = cpRecoGrammar->AddWordTransition(hNames, NULL, L"sally", NULL,
							SPWT_LEXICAL, 1.0f, NULL);
		// Check hr

		// add name "jim"
		hr = cpRecoGrammar->AddWordTransition(hNames, NULL, L"jim", NULL,
							SPWT_LEXICAL, 1.0f, NULL);
		// Check hr

		// add name "diane"
		hr = cpRecoGrammar->AddWordTransition(hNames, NULL, L"diane", NULL,
							SPWT_LEXICAL, 1.0f, NULL);
		// Check hr

		// save grammar changes
		hr = cpRecoGrammar->Commit(NULL);
		// Check hr


<RULEREF>

Summary: The RULEREF tag is used for importing rules from the same grammar, or another
		grammar. The RULEREF tag is especially useful for reusing component or
		off-the-shelf rules and grammars.

XML Attributes:
	NAME (required if REFID is omitted): Specifies the string identifier of the rule to reference. The NAME
		or the REFID must be specified. If both are specified, they must refer to the
		same rule.
	OBJECT (optional): Specifies the programmatic identifier (ProgId) of the COM
		object which contains the compiled grammar.
	PROPID (optional, type=VT_I4): Specifies the numeric identifier of the semantic property
		attached to the rule reference.
	PROPNAME (optional): Specifies the string identifier of the semantic property attached
		to the rule reference.
	REFID (required if NAME is omitted, type=VT_I4): Specifies the numeric identifier of the rule to reference.
		The NAME or the REFID must be specified. If both are specified, they must refer
		to the same rule.
	URL (optional): Specifies the uniform resource locator (URL) of the rule to reference.
		The URL can be prefixed by "http://", "file://", or no prefix for a relative
		address. The URL can reference either a compiled grammar (e.g. *.cfg) or an
		uncompiled XML grammar (e.g. *.xml) which will be compiled by SAPI on demand.
	VAL (optional): Specifies the numeric value that will be associated with the semantic
		property attached to the rule reference.
	VALSTR (optional): Specifies the string value that will be associated with the semantic
		property attached to the rule reference.
	WEIGHT (optional, type=VT_UI4,VT_I4,VT_R4,VT_R8, default=1/n_sibling_transitions): The
		probability of the contents of the rule (which is referenced) being spoken by
		the user.

XML Parent Elements:
	LIST, L: List of phrases or rules which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase causing the rule reference to be implicitly optional.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	None

Detailed Description:
	The RULEREF tag is provided to grammar authors to allow for grammar reusability, and for
		structuring semantic properties into a hierarchy.
	Grammar reusability is provided by allowing rules to reference other rules. For example,
		an independent software vendor (ISV) could develop a series of grammars that
		supported mathematical operations and easy-to-speak numbers. They could redistribute
		their grammars via either a web site (URL, http), a COM object (ProgId), or a
		compiled grammar. Grammar authors who want to use the ISV's grammars would only
		need to add a RULEREF tag into their grammar which referenced the appropriate
		file or resource location. Similarly, grammar authors can build basic rule
		components into their grammars (e.g. spelling, numbers, or proper names), then
		build complex commands by reusing the basic rule components (local rule reference).
	Structured, hierarchical semantic properties are built on top of RULEs and RULEREFs. All of
		the semantic properties specified inside of a rule are siblings (ordered by
		order of declaration in the recognized transition path). The semantic properties
		that are in rules referenced by another rule are child properties of the
		rule that made the reference. For example, examine the following grammar:
			<RULE NAME="A" TOPLEVEL="ACTIVE">
				<P PROPNAME="ROOT">
					<RULEREF NAME="B" PROPNAME="ROOT_SIBLING"/>
				</P>
			</RULE>
			<RULE NAME="B">
				<P PROPNAME="CHILD">hello</P>
				<P PROPNAME="LEAF">world</P>
			</RULE>
		The grammar contains two rules: a top-level rule, and a second rule that it references.
		The top-level rule contains two semantic properties, one attached to a phrase tag
		("ROOT"), and the other attached to the rule reference tag ("ROOT_SIBLING").
		The second rule also contains two semantic properties, each attached to a
		phrase tag ("CHILD" and "LEAF"). If the recognized phrase is "hello world",
		the semantic property structure is as follows:
			SPPHRASE->pProperties->pszName == "ROOT"
			SPPHRASE->pProperties->pNextSibling->pszName == "ROOT_SIBLING"
			SPPHRASE->pProperties->pFirstChild->pszName == "CHILD"
			SPPHRASE->pProperties->pFirstChild->pNextSibling->pszName == "LEAF"
		Note that no matter how many phrases or semantic properties are contained in a
		single RULE, all of the properties are siblings. Child semantic properties are only
		created by using rule references. See also the Whitepaper, Designing Grammar Rules:
		Retrieving Semantic Properties.
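
	A sibling/child structure of this kind is usually read with a recursive walk.
		The sketch below is illustrative only: PropertyNode is a hypothetical
		stand-in that copies just three field names (pszName, pFirstChild,
		pNextSibling) from SPPHRASEPROPERTY so that the example compiles without
		the SAPI headers, and CollectProperties is an assumed helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the three SPPHRASEPROPERTY fields used in the listing above.
// The real structure has more members; this type is for illustration only.
struct PropertyNode
{
    const char* pszName;
    const PropertyNode* pFirstChild;
    const PropertyNode* pNextSibling;
};

// Visits a property, then its children, then its next sibling, appending one
// entry per property to 'out'. Each entry is indented two spaces per level.
void CollectProperties(const PropertyNode* node, int depth,
                       std::vector<std::string>& out)
{
    for (; node != nullptr; node = node->pNextSibling)
    {
        const std::size_t indent = static_cast<std::size_t>(depth) * 2;
        out.push_back(std::string(indent, ' ') + node->pszName);
        CollectProperties(node->pFirstChild, depth + 1, out);
    }
}
```

	With real recognition results, the same loop would start from the
		pProperties member of the SPPHRASE returned for the recognized phrase,
		with the field types adjusted to match the SAPI declarations.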

XML Grammar Sample(s):
	<GRAMMAR>
		<DEFINE>
			<ID NAME="RID_Numbers" VAL="1"/>
			<ID NAME="RID_AddNumbers" VAL="2"/>
			<ID NAME="PID_Value" VAL="1"/>
		</DEFINE>
		<!-- create a simple rule that reuses the local numbers rule component -->
		<RULE ID="RID_AddNumbers" TOPLEVEL="ACTIVE">
			<P>add</P>
			<!-- the first operand will be a number from the numbers rule-->
			<!-- the application can retrieve the child property of this property "operand_1"
				which has a value of 1-5 -->
			<RULEREF REFID="RID_Numbers" PROPNAME="operand_1"/>
			<P>to</P>
			<!-- the second operand will be a number from the numbers rule-->
			<!-- the application can retrieve the child property of this property "operand_2"
				which has a value of 1-5 -->
			<RULEREF REFID="RID_Numbers" PROPNAME="operand_2"/>
		</RULE>

		<!-- Note that rule is not top-level and is only used as a reusable component rule -->
		<RULE ID="RID_Numbers">
			<LIST PROPID="PID_Value">
				<P VAL="1">one</P>
				<P VAL="2">two</P>
				<P VAL="3">three</P>
				<P VAL="4">four</P>
				<P VAL="5">five</P>
			</LIST>
		</RULE>

		<RULE NAME="SearchWeb" TOPLEVEL="ACTIVE">
			<P>search web for site named</P>
			<!-- Reference a fictitious rule located on the web which contains a daily updated
				list of SR-friendly web site names -->
			<RULEREF NAME="SiteNames" URL="http://www.msn.com/WebServices/SpeechObjects.cfg"/>
		</RULE>

		<RULE NAME="SearchAddressBook" TOPLEVEL="ACTIVE">
			<P>find address of</P>
			<!-- Reference a fictitious rule located in a registered COM object, which contains
				a dynamic list of Exchange server address book names -->
			<RULEREF NAME="FullNames" OBJECT="Exchange.SpeechGrammars"/>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	Application developers can programmatically import rules from URLs by using the following format:
		Rule Name = "URL:" + FILENAME + "\\" + RULENAME
	For example, to import a rule called "Numbers" from the file "A.cfg", use the following sample code:

		SPSTATEHANDLE hSpeakNumber;
		SPSTATEHANDLE hsBeforeImport;
		SPSTATEHANDLE hsRuleImport;
		// Create new rule called "SpeakNumber"
		hr = cpRecoGrammar->GetRule(L"SpeakNumber", NULL, NULL, TRUE, &hSpeakNumber);
		// Check hr

		// Create new state for the beginning text
		hr = cpRecoGrammar->CreateNewState(hSpeakNumber, &hsBeforeImport);
		// Check hr

		// add the beginning text "speak the number"
		hr = cpRecoGrammar->AddWordTransition(hSpeakNumber, hsBeforeImport, L"speak the number",
							L" ", SPWT_LEXICAL, 1.0f, NULL);
		// Check hr

		// Import the rule "Numbers" from A.cfg
		hr = cpRecoGrammar->GetRule(L"URL:file://A.cfg\\Numbers", 0, SPRAF_Import, TRUE, &hsRuleImport);
		// Check hr

		// reference the "Numbers" rule after the beginning text
		hr = cpRecoGrammar->AddRuleTransition(hsBeforeImport, NULL, hsRuleImport, 1.0f, NULL);
		// Check hr

		hr = cpRecoGrammar->Commit(NULL);
		// Check hr
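
	The import rule name can also be assembled at run time instead of being
		hard-coded. The helper below is a minimal sketch of the format given
		above; MakeImportRuleName is a hypothetical function name and is not part
		of SAPI. The "\\" in the format is a C++ escape for a single backslash.

```cpp
#include <cassert>
#include <string>

// Builds the rule name used when importing a rule by URL:
//   "URL:" + FILENAME + "\\" + RULENAME
// Both arguments must be non-empty.
std::wstring MakeImportRuleName(const std::wstring& fileUrl,
                                const std::wstring& ruleName)
{
    assert(!fileUrl.empty() && !ruleName.empty());
    return L"URL:" + fileUrl + L"\\" + ruleName;
}
```

	For example, MakeImportRuleName(L"file://A.cfg", L"Numbers") yields the same
		string that is passed to GetRule in the sample above, and its c_str()
		result could be passed in place of the literal.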


<TEXTBUFFER>

Summary: The TEXTBUFFER tag is used for applications needing to integrate a dynamic
	text box or text selection with a voice command.

XML Attributes:
	PROPID (optional, type=VT_I4): Specifies the semantic property's numeric identifier.
	PROPNAME (optional): Specifies the semantic property's string identifier.
	WEIGHT (optional, type=VT_UI4,VT_I4,VT_R4,VT_R8, default=1/n_sibling_transitions): Specifies
		the probability of the TEXTBUFFER-based phrase being spoken by the user.

XML Parent Elements:
	LIST, L: List of phrases which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase that may be recognized.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	None

Detailed Description:
	The TEXTBUFFER tag is useful for applications that have a dynamic buffer of text,
		and want to allow the user to speak portions of the text. The most obvious
		example is likely the text selection user interface. The application offers
		a buffer of text, and allows the user to select any contiguous subset of
		the buffer. For example, when the text is "a b c d e", the user can select
		"a b c" and "c d e", but not "b e" since it is not a contiguous subset of
		the text buffer.
	The TEXTBUFFER tag allows the grammar author to define a command, and reference the
		dynamic text buffer which will be set and maintained at application run time.
		For example, the grammar might contain the command "select TEXTBUFFER_PORTION",
		which, when using the previous text sample, would allow the phrases "select a
		b c" and "select c d e", but not "select b e". The grammar author should focus
		her efforts on building commands to operate on the text buffer, while the
		application developer need only focus on maintaining the text buffer (see
		ISpRecoGrammar::SetWordSequenceData and ISpRecoGrammar::SetTextSelection) and
		responding to the TEXTBUFFER-based commands.
	The TEXTBUFFER has three main components: the complete text buffer, the allowed
		text subsets in the buffer, and the active selection. The complete text buffer
		is a string of text characters, which is double-NULL terminated. The reason
		for using a double-NULL terminator is to allow for multiple exclusive subsets
		of the buffer to be active (e.g. each subset is a paragraph). The recognition
		engine will not recognize phrases which span the exclusive subsets (delimited
		by a single NULL character). The third component is the active selection, or
		current portion of the buffer that should be recognizable (e.g. the application
		can update the selection to include only the text visible on the screen, or only
		the text selected by the user). Note that any portion of the buffer that is
		not included in the TEXTBUFFER's active selection is not recognizable.
	The TEXTBUFFER tag is shared across all of the commands associated with a single
		grammar object. For applications that need to support multiple text buffers,
		the application has three options. If the text buffers use the same commands,
		but do not need to be active simultaneously, the application can use the active
		selection feature (of the TEXTBUFFER) to switch between buffers. If the text
		buffers are unique, but the buffers need to be active simultaneously, the
		application can use the single-NULL terminated subsets of the TEXTBUFFER
		(noting that each set is exclusive and non-contiguous). Finally, if the
		application has multiple text buffers, requires the buffers to be active
		simultaneously, and uses different commands for each buffer, the application
		can use a single grammar object for each buffer.
	The application should use semantic properties (see attributes PROPNAME and PROPID)
		to quickly and easily parse the TEXTBUFFER-related text out of the command.
		SAPI will automatically set the semantic property's phrase
		element range to match the elements taken from the TEXTBUFFER.
	The speech recognition engine must support text-buffers inside of a CFG for the
		grammar to load and activate successfully. The application can determine if
		an engine supports the TEXTBUFFER tag by retrieving the SR engine's object
		token (see ISpRecognizer::GetRecognizer), and then checking for the existence
		of the engine attribute "WordSequences" (see ISpObjectToken::MatchesAttributes).
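
	The buffer layout described above (exclusive subsets separated by a single
		NULL, with a double NULL at the end) can be sketched in portable C++.
		BuildWordSequenceBuffer is a hypothetical helper, not a SAPI function; it
		only shows how such a buffer might be packed before it is handed to
		ISpRecoGrammar::SetWordSequenceData.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Packs each exclusive subset (for example, one paragraph) into one buffer:
// subsets are separated by a single L'\0' and the buffer ends with two.
std::vector<wchar_t> BuildWordSequenceBuffer(const std::vector<std::wstring>& subsets)
{
    std::vector<wchar_t> buffer;
    for (std::size_t i = 0; i < subsets.size(); ++i)
    {
        // An empty subset would create an early double NULL, so reject it.
        assert(!subsets[i].empty());
        if (i > 0)
            buffer.push_back(L'\0');   // single NULL between subsets
        buffer.insert(buffer.end(), subsets[i].begin(), subsets[i].end());
    }
    buffer.push_back(L'\0');           // double NULL terminator
    buffer.push_back(L'\0');
    return buffer;
}
```

	The resulting size counts both terminating NULL characters, which matches
		the cch + 2 length used in the programmatic sample for this tag. Whether a
		given engine accepts more than one subset is something to confirm against
		the engine's documentation.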

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- basic command to perform text selection -->
		<!-- PID_SelectedText is assumed to be declared in a DEFINE block (not shown) -->
		<RULE NAME="SelectText" TOPLEVEL="ACTIVE">
			<P>select the words</P>
			<TEXTBUFFER PROPID="PID_SelectedText"/>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To programmatically create a text-buffer transition in a CFG, the application developer
		can use the ISpGrammarBuilder::AddRuleTransition with a special rule handle,
		called SPRULETRANS_TEXTBUFFER. For example, the following code creates a simple
		command called "SelectText" which recognizes the command "select TEXTBUFFER".

			SPSTATEHANDLE hsSelectText;
			// Create new top-level rule called "SelectText"
			hr = cpRecoGrammar->GetRule(L"SelectText", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsSelectText);
			// Check hr

			// Create an interim state before the text-buffer transition
			SPSTATEHANDLE hsBeforeTextBuffer;
			hr = cpRecoGrammar->CreateNewState(hsSelectText, &hsBeforeTextBuffer);
			// Check hr

			// Add the command word "select"
			hr = cpRecoGrammar->AddWordTransition(hsSelectText, hsBeforeTextBuffer,
					L"select", L" ", SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Add text-buffer transition
			hr = cpRecoGrammar->AddRuleTransition(hsBeforeTextBuffer, NULL,
								SPRULETRANS_TEXTBUFFER, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

			// ... perform other processing/setup

			// Setup text-buffer

			// Place the contents of text buffer into pwszCoMem and
			//	the length of the text in cch
			SPTEXTSELECTIONINFO tsi;
			tsi.ulStartActiveOffset = 0;
			tsi.cchActiveChars = cch;
			tsi.ulStartSelection = 0;
			tsi.cchSelection = cch;
			WCHAR *pwszCoMem2 = (WCHAR *)CoTaskMemAlloc(sizeof(WCHAR) * (cch + 2));
			if (pwszCoMem2)
			{
				// SetWordSequenceData requires double NULL terminator.
				memcpy(pwszCoMem2, pwszCoMem, sizeof(WCHAR) * cch);
				pwszCoMem2[cch] = L'\0';
				pwszCoMem2[cch+1] = L'\0';

				// set the text buffer data
				hr = cpRecoGrammar->SetWordSequenceData(pwszCoMem2, cch + 2, NULL);
				// Check hr

				// set the text selection information independently
				hr = cpRecoGrammar->SetTextSelection(&tsi);
				// Check hr
				CoTaskMemFree(pwszCoMem2);
			}
			CoTaskMemFree(pwszCoMem);

			// the SR engine is now capable of recognizing the contents of the text buffer


<WILDCARD>

Summary: The WILDCARD tag is used in rules or phrases that need added robustness and
	flexibility for the speaker's phrasing.

XML Attributes:
	None

XML Parent Elements:
	LIST, L: List of phrases which can be recognized.
	PHRASE, P: Phrase that must be recognized for the containing rule to be recognized.
	OPT, O: Optional phrase that may be recognized.
	RULE: Rule that contains phrases or text to be recognized.

XML Child Elements:
	None

Detailed Description:
	The WILDCARD tag is designed for applications that would like to recognize
		some phrases without failing due to irrelevant or ignorable words. For
		example, an application may have a command with the phrase "save document".
		Many users may trivially modify the phrase by saying "save my document",
		"save the document", "save this document", etc. With a pure CFG, the latter
		phrases would all fail to be recognized due to the extra words. The grammar
		author can add a wildcard, or garbage field, which will consume the extra
		words, and allow the application to successfully handle all of the phrases.
		In the aforementioned case, the grammar would need a wildcard before the word
		"document".
	The WILDCARD is different from DICTATION in that the application will never see the
		recognized garbage words, even though they were recognized. Consequently, the
		application and grammar author should not place wildcards in places which may
		affect the intended user action (e.g. "cancel save" is not the same as "please
		save").
	The grammar author can also use a special character, ellipsis (...), instead of the
		entire XML tag. See XML Grammar Format: Special Wildcard Tag.
	The speech recognition engine must support wildcards inside of a CFG for the grammar
		to load and activate successfully. The application can determine if an engine
		supports the WILDCARD tag by retrieving the SR engine's object token (see
		ISpRecognizer::GetRecognizer), and then checking for the existence of the
		engine attribute "WildcardInCFG" (see ISpObjectToken::MatchesAttributes).
		The engine can specify support for the WILDCARD tag to be anywhere in the
		CFG phrase (attribute value="Anywhere"), or only at the end (attribute
		value="Trailing").

XML Grammar Sample(s):
	<GRAMMAR>
		<!-- basic command to play the queen of hearts -->
		<RULE NAME="PlayCard" TOPLEVEL="ACTIVE">
			<P>play <WILDCARD/> queen of hearts</P>
		</RULE>

		<!-- basic command to play the queen of hearts, using special ellipsis -->
		<RULE NAME="PlayCard_Ellipsis" TOPLEVEL="ACTIVE">
			<P>play ... queen of hearts</P>
		</RULE>
	</GRAMMAR>

Programmatic Equivalent:
	To programmatically create a wildcard transition in a CFG, the application developer
		can use the ISpGrammarBuilder::AddRuleTransition with a special rule handle,
		called SPRULETRANS_WILDCARD. For example, the following code creates a simple
		command called "PlayCard" which recognizes the command "play WILDCARD queen of hearts".

			SPSTATEHANDLE hsPlayCard;
			// Create new top-level rule called "PlayCard"
			hr = cpRecoGrammar->GetRule(L"PlayCard", NULL,
							SPRAF_TopLevel | SPRAF_Active, TRUE,
							&hsPlayCard);
			// Check hr

			// Create an interim state before the wildcard transition
			SPSTATEHANDLE hsBeforeWildcard;
			hr = cpRecoGrammar->CreateNewState(hsPlayCard, &hsBeforeWildcard);
			// Check hr

			// Add the command word "play"
			hr = cpRecoGrammar->AddWordTransition(hsPlayCard, hsBeforeWildcard,
					L"play", L" ", SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// Create an interim state after the wildcard transition
			SPSTATEHANDLE hsAfterWildcard;
			hr = cpRecoGrammar->CreateNewState(hsPlayCard, &hsAfterWildcard);
			// Check hr

			// Add interim wildcard transition
			hr = cpRecoGrammar->AddRuleTransition(hsBeforeWildcard, hsAfterWildcard,
								SPRULETRANS_WILDCARD, 1.0f, NULL);
			// Check hr

			// Add the command words "queen of hearts"
			hr = cpRecoGrammar->AddWordTransition(hsAfterWildcard, NULL,
					L"queen of hearts", L" ", SPWT_LEXICAL, 1.0f, NULL);
			// Check hr

			// save/commit changes
			hr = cpRecoGrammar->Commit(NULL);
			// Check hr

	The previous sample code will support any of the following phrases:
		"play the queen of hearts"
		"play a queen of hearts"
		"play the left queen of hearts"
		etc.
	Note that the words matched by the wildcard ("the", "a", and "the left" in the
		phrases above) will be recognized by the speech recognition engine,
		but will not be returned to the application. The application should not place
		any words that application logic depends on inside of a wildcard, since the
		text is not returned.
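
	The matching behavior can be approximated with a toy matcher. This is a sketch
		under stated assumptions, not how a speech recognition engine works
		internally: it assumes the wildcard must consume at least one word (per
		the WILDCARD child element description under RULE), and the names
		Tokenize and MatchWithWildcard are hypothetical. The consumed words are
		returned here only to make the example observable; SAPI does not return
		them to the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a phrase into whitespace-separated words.
std::vector<std::string> Tokenize(const std::string& phrase)
{
    std::istringstream stream(phrase);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word)
        words.push_back(word);
    return words;
}

// Toy matcher for "<prefix> WILDCARD <suffix>". Returns true when 'spoken'
// starts with 'prefix', ends with 'suffix', and has at least one word in
// between. The in-between words are written to 'consumed'.
bool MatchWithWildcard(const std::vector<std::string>& spoken,
                       const std::vector<std::string>& prefix,
                       const std::vector<std::string>& suffix,
                       std::vector<std::string>& consumed)
{
    consumed.clear();
    if (spoken.size() < prefix.size() + suffix.size() + 1)
        return false;
    const std::ptrdiff_t prefixLen = static_cast<std::ptrdiff_t>(prefix.size());
    const std::ptrdiff_t suffixLen = static_cast<std::ptrdiff_t>(suffix.size());
    if (!std::equal(prefix.begin(), prefix.end(), spoken.begin()))
        return false;
    if (!std::equal(suffix.begin(), suffix.end(), spoken.end() - suffixLen))
        return false;
    consumed.assign(spoken.begin() + prefixLen, spoken.end() - suffixLen);
    return true;
}
```

	With the prefix "play" and the suffix "queen of hearts", the phrase "play the
		left queen of hearts" matches and the wildcard consumes "the left", while
		"play queen of hearts" does not match under the one-or-more assumption.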

