Before you begin creating grammar files, you will need to envision and define how users will interact with your application by speaking. For example, you may want to consider the following questions:
- Which application contexts are good candidates for voice interaction?
- What actions or application behaviors can a user initiate by voice?
- What words can a user say to initiate each action or behavior?
- How will users know when they can speak to the application and what they can say?
Having made these basic design decisions, you can begin to define the speech vocabulary for your application. To determine the words and phrases that a user of your application is likely to say:

1. Make a list of all the actions that a user can initiate in your application by speaking.
2. For each action, list the words or phrases that a user can say to initiate the action.
For example, if your application plays music files, it will likely include actions to begin, pause, and stop playback of a music file. Include these actions in your list and define the words that users can say to initiate each action, as suggested in the following table.
| Action | Spoken input |
|---|---|
| Begin playback | Play, play the song, play the tune, play the track, begin, begin playback, go, start |
| Pause playback | Pause, pause the song, pause the tune, pause the track, pause playback, suspend, wait, hold |
| Stop playback | Stop, stop the song, stop the tune, stop playback, end playback, quit |
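To preview how these spoken phrases might eventually be expressed in a grammar file, the following is a minimal sketch of an SRGS XML fragment for the Begin playback action, in which each phrase from the table becomes an alternative in a `<one-of>` list. The rule name `BeginPlayback` is an illustrative assumption; the next topic describes how to build the grammar structure itself.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sketch only: the rule name and layout are assumptions,
     not the structure prescribed in the next topic. -->
<grammar version="1.0" xml:lang="en-US" mode="voice"
         root="BeginPlayback"
         xmlns="http://www.w3.org/2001/06/grammar">

  <!-- One rule per action; each phrase from the table becomes an item. -->
  <rule id="BeginPlayback" scope="public">
    <one-of>
      <item>play</item>
      <item>play the song</item>
      <item>play the tune</item>
      <item>play the track</item>
      <item>begin</item>
      <item>begin playback</item>
      <item>go</item>
      <item>start</item>
    </one-of>
  </rule>

</grammar>
```

The Pause playback and Stop playback actions would follow the same pattern, each with its own rule listing the corresponding spoken input.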
See the next topic, Create an XML Grammar Structure (Microsoft.Speech), to learn how to create the structure that will contain the list of actions and spoken input that you compiled.