CoffeeS6

Microsoft Speech SDK SAPI 5.1

Introduction

CoffeeS6 is the seventh and final sample application in a tutorial series named Coffee. The series uses a consistent coffee shop motif: customers enter the shop, go to the service counter, and speak to order drinks or to enter the front office.

The samples are intended to demonstrate speech recognition capabilities within an application. They are designed for the application-level (API) programmer and for those not familiar with speech technology. Each sample progressively adds new features and increases in complexity. The tutorial chapters explain the particulars of the code in detail, and you are encouraged to read each chapter. Writing engines such as speech recognition or text-to-speech engines, also called device driver programming, is covered separately. The samples can use engines provided by the SAPI SDK or third-party SAPI-compliant engines.

Using CoffeeS6

CoffeeS6 expands the concepts of grammars and dictation presented in the previous examples. The early Coffee examples used a fixed grammar containing a select set of words. CoffeeS5 introduced dynamic grammars so you could add new words to an existing word list. CoffeeS6 goes one step further and uses dictation. With dictation, you are no longer limited to an explicit list but may use almost any word or words. However, instead of demonstrating this as a free-form dictation application, CoffeeS6 uses dictation in association with existing grammars.

To showcase this ability, CoffeeS6 enables you to rename the coffee shop. From the office, a new option is presented: Manage Store Name. Speak this command and the screen changes again. A new order is displayed on the screen allowing you to rename the shop. Say "Rename the coffee shop to" and speak a name. CoffeeS6 then speaks the new name of the store. For example, if you say, "Rename the coffee shop to My Coffee Emporium," the new name will be "My Coffee Emporium." The name will also be displayed throughout CoffeeS6. After saying the command, you may navigate to another location or continue to rename the store any number of times.

New Commands List

Choosing one word from each line of a category forms the command. Commands in parentheses are optional and do not need to be included. Words or phrases separated by slashes indicate that any of the listed choices may be used, although only one may be selected. Sections marked RULEREF indicate that words or phrases may be chosen from the corresponding rule ID. Rule names are the same as those listed in the corresponding XML configuration file.
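
The reading rules above can be illustrated with a short sketch. It is not part of the Coffee samples and does not use SAPI. The function names and the example command list are hypothetical, chosen only to show how optional lines and slash-separated choices combine into spoken commands.

```python
from itertools import product

def choices_for_line(line):
    """Return the alternatives that one command-list line allows."""
    line = line.strip()
    optional = line.startswith("(") and line.endswith(")")
    if optional:
        line = line[1:-1]
    options = [part.strip() for part in line.split("/")]
    if optional:
        options.append("")  # an optional line may be skipped entirely
    return options

def expand(lines):
    """Build every phrase formed by picking one choice from each line."""
    phrases = []
    for combo in product(*(choices_for_line(line) for line in lines)):
        phrases.append(" ".join(word for word in combo if word))
    return phrases

# Hypothetical command list, not taken from the CoffeeS6 grammar file.
example = ["(please)", "manage", "shop/store"]
for phrase in expand(example):
    print(phrase)
```

Each printed line is one command a speaker could say under this notation. RULEREF sections are not modeled here.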


The following rule is used to rename the coffee shop. However, because the shop name is not limited to a particular element from a list, the asterisk acts as a wildcard. Any word or words are permitted. Additionally, the plus sign forces a greater level of confidence. See VID_ThingsToManage for a more detailed explanation of the plus sign. Requiring greater confidence forces the speech recognition engine to spend additional time processing the word. Greater confidence results in either better recognition of the word or a higher confidence in the word returned by the speech recognition engine.

XML rule ID: VID_Rename

  • Rename the coffee shop to *+
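
As a rough illustration of what the rule yields, the sketch below separates the fixed command words from the dictated name. It works on plain text and is only an assumption about the shape of the result. The actual sample obtains the dictated portion from the recognition result returned by the engine, and the names `PREFIX` and `extract_shop_name` are hypothetical.

```python
PREFIX = "rename the coffee shop to"

def extract_shop_name(recognized_text):
    """Return the dictated name, or None if the phrase does not fit the rule."""
    words = recognized_text.strip().rstrip(".").split()
    prefix_words = PREFIX.split()
    head = [word.lower() for word in words[:len(prefix_words)]]
    name_words = words[len(prefix_words):]
    # The fixed words must match and at least one dictated word must follow.
    if head != prefix_words or not name_words:
        return None
    return " ".join(name_words)

print(extract_shop_name("Rename the coffee shop to My Coffee Emporium"))
```

A phrase that stops after "to", or that does not begin with the command words, returns None in this sketch.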

The last two items of the next rule are new to CoffeeS6. As expected, you can say "shop" and "store." However, the command allows for a change of emphasis on words. The plus or minus sign changes the confidence required for word recognition. By increasing the required confidence level, the speech recognizer demands a higher quality of the word being recognized. In a similar way, decreasing the required confidence allows greater latitude in the word's recognized quality. This way, certain words can be emphasized. For instance, the following rule ensures that the word "name" is recognized. Because regional accents or background noise may detract from the quality of the word, the new emphasis makes certain that the sound heard was actually "name." In turn, the rule places less emphasis on the word "shop" or "store." Although these words are still required, it is not as important to be certain of them.

XML rule ID: VID_ThingsToManage

  • employees
  • -shop +name
  • -store +name
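
The effect of the plus and minus signs can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual logic: the three confidence levels, the `(word, level)` pairs standing in for a recognizer's output, and the function names are all hypothetical.

```python
# Illustrative confidence levels, assumed for this sketch only.
LOW, NORMAL, HIGH = -1, 0, 1

def required_level(token):
    """Split a rule token such as '+name' into (word, required level)."""
    if token.startswith("+"):
        return token[1:], HIGH
    if token.startswith("-"):
        return token[1:], LOW
    return token, NORMAL

def rule_accepts(rule, heard):
    """Check (word, level) pairs from a hypothetical recognizer against a rule."""
    tokens = rule.split()
    if len(tokens) != len(heard):
        return False
    for token, (word, level) in zip(tokens, heard):
        expected, needed = required_level(token)
        # Every word is still required; only the confidence bar differs.
        if word != expected or level < needed:
            return False
    return True

# "shop" may be heard poorly, but "name" must be heard clearly.
print(rule_accepts("-shop +name", [("shop", LOW), ("name", HIGH)]))
print(rule_accepts("-shop +name", [("shop", HIGH), ("name", NORMAL)]))
```

In this model the first call succeeds and the second fails, because "name" did not reach the higher level that the plus sign demands.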

Additionally, two existing rules have new list elements. The ellipsis disregards any words spoken for the current element. For example, using VID_EspressoDrinks you can say "May I have a coffee," or "Get me coffee," as expected. However, by including the ellipsis in the rule, you may also say "I dunno, how about a coffee." The statement will be recognized because everything before the drink name can essentially be ignored. Including the ellipsis as a list element removes much of the burden of listing all words individually.

XML rule ID: VID_EspressoDrinks

XML rule ID: VID_OrderList

  • ...
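
A minimal sketch of the ellipsis behavior appears below. It treats a rule as a list of tokens and lets "..." absorb any run of words. Whether the engine allows the ellipsis to absorb zero words is an assumption made here for simplicity, and `matches` and `normalize` are hypothetical helper names, not SAPI functions.

```python
def normalize(utterance):
    """Lowercase the utterance and drop simple punctuation before splitting."""
    return utterance.lower().replace(",", "").rstrip(".").split()

def matches(pattern, words):
    """Return True if words fit pattern; '...' absorbs any run of words."""
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "...":
        # Let the ellipsis absorb 0, 1, 2, ... leading words in turn.
        return any(matches(rest, words[i:]) for i in range(len(words) + 1))
    return bool(words) and words[0] == head and matches(rest, words[1:])

rule = ["...", "coffee"]
print(matches(rule, normalize("I dunno, how about a coffee")))
print(matches(rule, normalize("How about a tea")))
```

The first utterance fits because everything before "coffee" is ignored. The second does not, because the required drink name never appears.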