Microsoft Speech SDK Setup 5.1
Introduction
The Microsoft Speech SDK setup is built on the Windows Installer technology. The SAPI 5.1 core components are available for distribution only as Windows Installer merge modules, which are .msm files. These .msm files should be included in an application's setup and packaged in a Microsoft Installer .msi file. The .msi file consumes the merge modules and handles the actual installation process. All other application-specific installation files should also be included in this .msi file. The .msi is run by the setup.exe file, which determines whether the Windows Installer is present on your system and installs it if necessary. SAPI 5.1 is redistributable by independent software vendors (ISVs) or individuals, who include the Speech SDK MSMs in their setup process and use the Windows Installer merge module technologies.
The following topics are covered in this section:
- SAPI 4.0 and SAPI 5.1 coexistence
- Speech SDK merge modules (MSM)
- Speech SDK core modules
- Speech recognition (SR) modules
- Text-to-speech (TTS) modules
- Speech SDK modules
- Speech SDK file locations
- Speech SDK files
- Quiet Install
- Registry settings
- Building your setup package
- Glossary of terms and abbreviations
SAPI 4.0 and SAPI 5.1 coexistence
In order to ensure that SAPI 4.0 and SAPI 5.1 can coexist on a single machine, the following four steps have been taken:
- SAPI 5.1 dlls have different names from SAPI 4.0 dlls
- SAPI 5.1 registry keys are registered in different locations
- SAPI 5.1 GUIDs are different from SAPI 4.0 GUIDs
- SAPI 5.1 registry keys are different from SAPI 4.0 registry keys
There is one known issue: SAPI 4.0 and SAPI 5.1 cannot share a common microphone. If one version of SAPI has control of the microphone, the other cannot use it. However, the functionality of TTS is not affected.
The engines that will be shipped with the Microsoft Speech SDK are shown below:
Engine | Language |
---|---|
SR | English |
SR | Japanese |
SR | Chinese |
TTS | English |
TTS | Chinese |
Speech SDK merge modules (MSM)
The Speech SDK setup will produce merge modules for use with MSI to support the following configurations:
- Speech SDK core modules
- Speech Recognition (SR) modules
- Text to Speech (TTS) modules
- Speech SDK modules
Speech SDK core modules
- sp5.msm
- SAPI 5.1 includes the following: SAPI.DLL, SAPISVR.exe, sapi.cpl, and the sapi.cpl help files. This module is made available as a redistributable component for ISVs, both application vendors and engine vendors. ISVs are expected to install their engines in the proper location and make the proper registry settings for the engines. For more information, see the Speech SDK file locations section below.
- Sp5intl.msm
- These modules contain localized resource .dlls needed for the Control Panel, sapi.cpl. (Sapi.cpl itself contains English resources, so if none of the sp5intl.msm modules is installed, it defaults to English.) The Control Panel is the GUI used to select the various speech engines and for setting a voice enabled application. The Control Panel has been localized for other languages. Currently, there are three separate language dependent modules:
- English
- Japanese
- Chinese (simplified)
If the number of localized languages increases, a separate .msm for each language will be made available. It is recommended to include all localized .msm files in third-party setups so that installations on Japanese or Chinese systems are correctly localized.
This module has a dependency on sp5.msm.
Speech Recognition (SR) modules
- Sp5sr.msm
- The Microsoft SAPI 5.1 SR engine files are contained within this module (spsreng.dll and spsrx.dll). There are no speech data files located within this module, as the Microsoft SR engine is language independent. This implies that one engine can perform SR processing for multiple languages by loading different speech data files.
This module has a dependency on sp5.msm and sp5intl.msm.
- Sp5itn.msm
- The language specific SAPI 5.1 ITN components are located within this file. The ITN modules enable developers to include Inverse Text Normalization in the SR applications. There are three separate language dependent modules:
- English
- Japanese
- Chinese (simplified)
This module has a dependency on sp5sr.msm, sp5intl.msm and sp5.msm.
- Sp5ccint.msm
- All the acoustic and language modules of the Microsoft SAPI 5.1 SR engine are contained in this merge module. This module also contains localized resource .dlls for the Microsoft SR engine (spsrx.dll). These resource .dlls contain the user interfaces for the Training wizard and the Microphone wizard. (Spsrx.dll itself contains English resources, so if no sp5ccint.msm is installed, it defaults to English.) There are three separate language dependent modules:
- English
- Japanese
- Chinese (simplified)
This module has a dependency on sp5sr.msm, sp5intl.msm and sp5.msm.
Text to Speech (TTS) modules
- Sp5ttint.msm
- The Microsoft SAPI 5.1 TTS English engine is a language dependent TTS engine. This implies that the TTS engine module will contain the engine as well as the data files for the engine. Currently, Microsoft is shipping the following language TTS engines:
- English
- Chinese (simplified)
This module has a dependency on sp5.msm and sp5intl.msm.
- spcommon.msm
- Contains files that are common to both the Microsoft SAPI 5.1 TTS and SR engine. Currently, this is shipped for the following languages:
- English
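The dependencies stated for each merge module above can be turned into a small lookup that answers "which .msm files must my setup include, and in what order?". The following Python sketch is only an illustration: the dictionary restates the dependencies listed in this section (spcommon.msm is given no dependencies because none are stated), and the function name `modules_to_include` is hypothetical rather than part of any SDK tool.

```python
# Dependencies as stated in this section. spcommon.msm has no stated
# dependency, so it is listed with none (an assumption of this sketch).
MSM_DEPENDENCIES = {
    "sp5.msm": [],
    "sp5intl.msm": ["sp5.msm"],
    "sp5sr.msm": ["sp5.msm", "sp5intl.msm"],
    "sp5itn.msm": ["sp5sr.msm", "sp5intl.msm", "sp5.msm"],
    "sp5ccint.msm": ["sp5sr.msm", "sp5intl.msm", "sp5.msm"],
    "sp5ttint.msm": ["sp5.msm", "sp5intl.msm"],
    "spcommon.msm": [],
}


def modules_to_include(requested):
    """Return the requested modules plus all dependencies, dependencies first."""
    ordered = []

    def visit(name):
        key = name.lower()  # module names appear in mixed case in the text
        if key not in MSM_DEPENDENCIES:
            raise KeyError(f"unknown merge module: {name}")
        if key in ordered:
            return
        for dependency in MSM_DEPENDENCIES[key]:
            visit(dependency)
        ordered.append(key)

    for name in requested:
        visit(name)
    return ordered


if __name__ == "__main__":
    # A setup that only wants Inverse Text Normalization still needs
    # the core, localization, and SR engine modules.
    print(modules_to_include(["Sp5itn.msm"]))
```

A depth-first walk is enough here because the dependency graph is tiny and has no cycles; a real setup tool resolves merge module dependencies itself.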
Microsoft Speech SDK modules
- Sp5sdk.msi
- The full installation of the Microsoft Speech SDK includes the following modules:
- sp5.msm
- sp5intl.msm
- sp5sr.msm
- sp5ccint.msm
- Sp5ttint.msm
- spcommon.msm
The Speech SDK samples and help documentation for SAPI 5.1 API/DDI will be included in the installation.
Speech SDK file locations
Setup will verify the versions of the various installed components. Setup will detect whether the operating system is one of the following:
Supported Operating Systems
- Microsoft Windows NT Workstation 4.0, Service Pack 6a, English, Japanese or Simplified Chinese edition.
- Microsoft Windows 2000 Professional Workstation, English edition or English edition with Japanese or Simplified Chinese Language support.
- Microsoft Windows 98. However, Windows 95 is not supported.
- Microsoft Windows Millennium edition.
If you attempt to install the Microsoft Speech SDK on a non-supported operating system, a dialog box appears with the string "SAPI5 is currently not supported on this Operating System. You must upgrade to Windows 98 or higher." After the Speech SDK has been installed on your computer, the footprint will not be deleted from the drive. If you accidentally delete sapi.dll and then try to run one of the applications, the footprint file is able to retrieve sapi.dll and reinstall it. All engine files should follow the 8.3 naming convention.
NOTE: To obtain the Chinese (Simplified) and Japanese Microsoft SR engines and the Chinese (Simplified) TTS engine, please install the Microsoft Speech SDK 5.1 Language Pack.
- SAPI.DLL
- The sapi.dll file is the main dll for SAPI 5.1. This file should be independent of engine vendor and language. As a result, this file should be located in the system directory.
- Control Panel
- The Control Panel file (sapi.cpl) implements the speech Control Panel. This file should be independent of engine vendor and language. As a result, this file should be located in the system directory.
- Lexicons
- The SAPI 5.1 user lexicons should be placed in the user's profile directory under the speech directory. For example, for Windows 2000 installations, this would be Documents and Settings\<user name>\Speech.
- SR Engine
Microsoft SR Engine
- The SR engines may be installed on any drive on a user's computer. The SR engine is language independent (i.e., the same engine is loaded with different data to create a different language engine). As a result, the SR engine files should, by default, be located in a language independent path, as follows: program files\common files\speechengines\Microsoft\sr
The SR data files contain the language specific information. The SR data files (including the files needed for command and control (C and C) and ITN) should be located in the following path: program files\common files\speechengines\Microsoft\sr\<LCID>, where <LCID> is 1033 (English), 2052 (Chinese (simplified)) or 1041 (Japanese).
- TTS Engine
Microsoft TTS Engine
- The TTS engines may be installed on any drive on a user's computer. The Microsoft TTS engine is not language independent (i.e., currently, each language is based on a different TTS code base). As a result, the TTS engine by default should be placed under the LCID it represents. The TTS engine files should be located in the following path: program files\common files\speechengines\Microsoft\TTS\<LCID> where the <LCID> is 1033 (English), 2052 (Chinese (simplified)) and 1041 (Japanese)
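As a rough illustration of the layout described above, the following Python sketch builds the default SR engine, SR data, and TTS engine paths from a language name. The path segments and LCID values come from this section; the `program_files` argument, the example drive, and the function names are hypothetical and exist only for the sketch.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

# LCID values as listed in this section.
LCIDS = {
    "English": 1033,
    "Chinese (simplified)": 2052,
    "Japanese": 1041,
}

# Vendor root shared by the SR and TTS engines, relative to "program files".
ENGINE_ROOT = ntpath.join("common files", "speechengines", "Microsoft")


def sr_engine_dir(program_files):
    """Language independent directory for the SR engine files."""
    return ntpath.join(program_files, ENGINE_ROOT, "sr")


def sr_data_dir(program_files, language):
    """Directory for the language specific SR data files."""
    return ntpath.join(sr_engine_dir(program_files), str(LCIDS[language]))


def tts_engine_dir(program_files, language):
    """Directory for a TTS engine, which lives under the LCID it represents."""
    return ntpath.join(program_files, ENGINE_ROOT, "TTS", str(LCIDS[language]))


if __name__ == "__main__":
    root = r"C:\Program Files"  # assumed location, for illustration only
    print(sr_engine_dir(root))
    print(sr_data_dir(root, "Japanese"))
    print(tts_engine_dir(root, "English"))
```

The SR engine directory carries no LCID because one engine serves every language, while each TTS engine sits under its own LCID because each language uses a different code base.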
Speech SDK files
The Microsoft Speech SDK contains a number of samples and tools. These samples should be located in the following directories:
- Executable Files
- All compiled executable files of the samples and the grammar compiler are located in the following directory: \Microsoft Speech SDK 5.1\bin
- Help Documentation
- The reference file SAPI5SDK.chm is located in: \Microsoft Speech SDK 5.1\docs\help
- SAPI 5.1 IDL
- The sapi.idl file contains all of the API function declarations in SAPI 5.1. This is the main file used by application developers when developing speech enabled applications. It is located in: \Microsoft Speech SDK 5.1\idl
- SAPI5ddk idl
- The sapi5ddk.idl file contains the DDI function declarations for SAPI. This is the main file used by engine developers when developing speech engines. It is located in: \Microsoft Speech SDK 5.1\idl
- Header files
- The header files for SAPI 5.1 should be located in: \Microsoft Speech SDK 5.1\include
- Miscellaneous
- The following folder contains sapi.lib: \Microsoft Speech SDK 5.1\lib\i386
- Samples
- The following table outlines the location of the source code for the various sample applications and tools.
Name | Path |
---|---|
Dictation Pad | \Microsoft Speech SDK 5.1\samples\cpp\dictpad |
Simple Dictation | \Microsoft Speech SDK 5.1\samples\cpp\simpledict |
TTSApp | \Microsoft Speech SDK 5.1\samples\cpp\TTSApp |
Tutorial - Coffee S0 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS0 |
Tutorial - Coffee S1 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS1 |
Tutorial - Coffee S2 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS2 |
Tutorial - Coffee S3 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS3 |
Tutorial - Coffee S4 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS4 |
Tutorial - Coffee S5 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS5 |
Tutorial - Coffee S6 | \Microsoft Speech SDK 5.1\samples\cpp\tutorials\CoffeeS6 |
Talkback | \Microsoft Speech SDK 5.1\samples\cpp\talkback |
Telephony Application | \Microsoft Speech SDK 5.1\samples\cpp\telephony |
SR Engine (Null engine) | \Microsoft Speech SDK 5.1\samples\cpp\engines\SR |
TTS engine (Null engine) | \Microsoft Speech SDK 5.1\samples\cpp\engines\TTS |
SPComp | \Microsoft Speech SDK 5.1\bin |
SRComp | \Microsoft Speech SDK 5.1\tools\comp\SR |
TTSComp | \Microsoft Speech SDK 5.1\tools\comp\TTS |
Grammar Editor | \Microsoft Speech SDK 5.1\bin |
SimpleAudioDll | \Microsoft Speech SDK 5.1\Samples\CPP\SimpleAudioDll |
TapiCustomStream | \Microsoft Speech SDK 5.1\Samples\CPP\TapiCustomStream |
ListBoxCSharp | \Microsoft Speech SDK 5.1\Samples\CSharp\Listbox |
SimpleTTSCSharp | \Microsoft Speech SDK 5.1\Samples\CSharp\SimpleTTS |
SimpleTTSJScript | \Microsoft Speech SDK 5.1\Samples\Scripts\SimpleTTS |
AudioApp | \Microsoft Speech SDK 5.1\Samples\VB\AudioApp |
ListboxVB | \Microsoft Speech SDK 5.1\Samples\VB\ListboxVB |
RecoVB | \Microsoft Speech SDK 5.1\Samples\VB\RecoVB |
SimpleDictVB | \Microsoft Speech SDK 5.1\Samples\VB\SimpleDict |
SimpleTTSVB | \Microsoft Speech SDK 5.1\Samples\VB\SimpleTTS |
TTSAppVB | \Microsoft Speech SDK 5.1\Samples\VB\TTSAppVB |
VBTapiSample | \Microsoft Speech SDK 5.1\Samples\VB\VBTAPISamples |
Mkvoice | \Microsoft Speech SDK 5.1\samples\cpp\engines\TTS\mkVoice |
Quiet Install
The command used for a quiet install is "msiexec /i "Microsoft Speech SDK 5.1.msi" /qn". Another option is "setup.exe /S /v/qn".
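If the quiet install is launched from a script rather than typed at a prompt, the msiexec command above can be assembled as an argument list so that the spaces in the .msi file name need no manual quoting. The Python sketch below is one possible wrapper, not part of the SDK: the function names are hypothetical, the default file name is the one quoted in this section, and actually running it requires Windows with the .msi present.

```python
import subprocess

DEFAULT_MSI = "Microsoft Speech SDK 5.1.msi"  # file name quoted in this section


def quiet_install_command(msi_path=DEFAULT_MSI):
    """Build the msiexec argument list for a quiet (/qn) install."""
    return ["msiexec", "/i", msi_path, "/qn"]


def run_quiet_install(msi_path=DEFAULT_MSI):
    """Run the quiet install and return the msiexec exit code (Windows only)."""
    completed = subprocess.run(quiet_install_command(msi_path), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Show the equivalent command line without executing it.
    print(subprocess.list2cmdline(quiet_install_command()))
```

Passing a list to `subprocess.run` lets Python quote each argument; `list2cmdline` is used here only to display the resulting command line.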
Registry Settings
For engine specific registry settings, please see the Object Tokens and Registry Settings white paper. All registry keys will be explicitly created upon installation and deleted upon uninstallation. This means that nothing in the setup procedure uses self registration. Setup will not handle the lazy initialization for the user profiles.
Setup is not able to know about the individuals who will be using speech features after installation. The lazy registration information will continue to be built up by sapi.dll at run time.
Building your Setup Package
The easiest way to build a Setup package that incorporates the SAPI 5.1 redistributable merge modules is to use a Setup tool that is designed specifically for the Windows Installer technology. A number of these exist on the market including, but not limited to: InstallShield for Windows Installer, Visual Studio Installer, Wise for Windows Installer, and Seagate WinINSTALL. These tools can consume the SAPI 5.1 merge modules (.msm files) and seamlessly install the components along with the rest of your setup. Consult the documentation on these products or follow the built-in wizards to build your Setup package and include the SAPI 5.1 .msm files.
If these tools are not available, consider downloading the Windows Installer SDK and building your Setup package manually. The following steps provide a walk through on how this is done.
- Download the Windows Installer SDK from Windows Installer 1.5.
- Plan the Sample Installation. When the installation of an existing
application is moved to the Windows Installer from another setup technology,
the setup developer may start authoring a Windows Installer package using
the source and target file images of the existing installation. A detailed
plan of how the files and other resources are organized at the source and
target is also a good starting point for developing a package for a new
application.
For example, if you have a TTS application (YourTTSApp.exe) that you want to install along with the SAPI 5.1 merge modules, simply determine the source and destination locations of your application.
File | Path To Source | Path To Target |
---|---|---|
YourTTSApp.exe | C:\YourApp\YourTTSApp.exe | D:\Program Files\CompanyName\YourTTSApp.exe |
- Obtain the blank installation database Schema.msi from the Windows Installer SDK and rename it to yourProduct.msi.
- Use the database editor Orca, which is provided with the SDK, or another editor, to open the installation database yourProduct.msi.
- Use the editor to modify the following tables in the .msi:
Directory Table
Component Table
File Table
Media Table
Feature Table
Feature Components Table
Registry Table
ShortCut Table
Icon Table
Property Table
InstallExecuteSequence Table
InstallUISequence Table
AdminExecuteSequence Table
AdminUISequence Table
AdvtExecuteSequence Table
For details on specific table values and entries, see the "Windows Installer Examples / An installation example" section of the Msi.chm help file that is included in the Windows Installer SDK.
- Use the MsiInfo.exe tool provided with the Windows Installer SDK to add
Summary Information to yourProduct.msi. The following properties must be set
for your product to pass Package Validation. It is recommended that authors
run validation on every new, or newly modified, installation package before
attempting to install the package (see the Package Validation section of the
Windows Installer SDK documentation for more about Package Validation).
Summary Information Property | Data | Notes |
---|---|---|
Template (Platform and Language) | ;1033 | Platform and language used by the database. Leaving the platform field empty indicates the package is platform independent. The ProductLanguage property from the database is typically used for this summary property. The sample's Language ID indicates that the package uses U.S. English. |
Revision Number (Package Code) | {49D185A1-D7FD-11D2-9159-00C04FD70856} | This is the package code GUID that uniquely identifies the sample package. If you reproduce this sample, use a utility such as GUIDGEN to generate a different GUID for your package. The results of GUIDGEN contain lowercase characters; you must change all lowercase characters to uppercase for a valid package code. See Package and Product Codes. |
Page Count (Minimum Installer Version) | 100 | For Windows Installer version 1.0, this property should be set to the integer 100. |
Word Count (Type of Source) | 0 | The value 0 indicates that the source files use long file names and are uncompressed. |
The remaining summary information stream properties are not required, but should be set for yourProduct.msi.
Summary Information Property | Data | Notes |
---|---|---|
Title | Installation Database | Informs users that this database is for an installation rather than a transform or a patch. |
Subject | yourProduct | File browsers can display this as the product to be installed with this database. |
Keywords | Installer, MSI, Database | File browsers that are capable of keyword searching can search for these words. |
Author | Your Company Name | Name of the product's manufacturer. |
Comments | This installer database contains the logic and data required to install YourProduct. | Informs the user about the purpose of this database. |
Creating Application | Orca | Application used to create the installation database. |
Security | 0 | The sample database is unrestricted read-write. |
To use MsiInfo to add the summary information to the sample, change to the directory containing the database yourProduct.msi and use the following command line:
MsiInfo.exe yourProduct.msi -T "Installation Database" -J "yourProduct" -A "Your Company Name" -K "Installer, MSI, Database" -O "This installer database contains the logic and data required to install YourProduct." -P ;1033 -V {49D185A1-D7FD-11D2-9159-00C04FD70856} -G 100 -W 0 -N Orca -U 0
- Add the following User Interface information to the Property Table:
Property | Value |
---|---|
DefaultUIFont | DlgFont8 |
INSTALLLEVEL | 3 |
LIMITUI | 1 |
Manufacturer | Your Company Name |
ProductCode | {19BED231-30AB-11D3-91D3-00C04FD70856} |
ProductLanguage | 1033 |
ProductName | yourProduct |
ProductVersion | 01.20.0000 |
UpgradeCode | {ACFBE060-33B8-11D3-91D6-00C04FD70856} |
- Validate your installation. See the Package Validation section of the Windows Installer SDK documentation for more about Package Validation.
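The summary information step above lends itself to scripting, since every package needs its own uppercase package code. The Python sketch below shows one way this might be done: it uses Python's `uuid` module in place of GUIDGEN (a substitution made for this example, not the documented tool) and assembles the MsiInfo argument list with the same switches and values as the command line in the walkthrough. The function names and default values are hypothetical.

```python
import uuid


def new_package_code():
    """Generate a fresh GUID in the uppercase, braced form a package code needs."""
    return "{" + str(uuid.uuid4()).upper() + "}"


def msiinfo_command(database, package_code,
                    subject="yourProduct",
                    author="Your Company Name",
                    comments=("This installer database contains the logic and "
                              "data required to install YourProduct.")):
    """Build the MsiInfo.exe argument list used in the walkthrough above."""
    return [
        "MsiInfo.exe", database,
        "-T", "Installation Database",     # Title
        "-J", subject,                     # Subject
        "-A", author,                      # Author
        "-K", "Installer, MSI, Database",  # Keywords
        "-O", comments,                    # Comments
        "-P", ";1033",                     # Template: platform independent, U.S. English
        "-V", package_code.upper(),        # Revision Number (package code), forced uppercase
        "-G", "100",                       # Page Count (minimum installer version)
        "-W", "0",                         # Word Count (type of source)
        "-N", "Orca",                      # Creating Application
        "-U", "0",                         # Security
    ]


if __name__ == "__main__":
    print(msiinfo_command("yourProduct.msi", new_package_code()))
```

The resulting list can be handed to `subprocess.run` on a machine where MsiInfo.exe from the Windows Installer SDK is on the path; the sketch deliberately stops at building the arguments.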
Glossary of Terms and Abbreviations
- API
- Application programming interface, the "top" side of the SAPI 5.1 middleware.
- C and C
- Command and control
- CSR
- Continuous speech recognition, also called dictation
- DDI
- Device Driver Interface. In SAPI 5.1, this is the interface speech engine providers code to (the underside of the middleware)
- MSI
- Microsoft Installer file containing the instructions and data required to install an application.
- MSM
- Microsoft Merge Module
- SAPI
- Microsoft Speech Application Programming Interface. SAPI 5.1 is middleware that provides an API for applications and a DDI for speech providers.
- SR
- Speech Recognition (includes both CSR and C and C)
- TTS
- Text-to-Speech, also called speech synthesis