The Universal Phone Set (UPS) provides unique phone labels for common compound sounds such as affricates and nasal vowels. Some languages also have less frequent compound sounds that need to be represented in SAPI lexicons. For these, UPS provides a compounding (tying) symbol "+" that describes a composite sound built from any two phones.
The German affricate "PF", diphthong "AI", and nasal vowel "AN" exist as unique phones in UPS, but they can also be represented as compounds. The following table lists the compound equivalents of "PF" and "AI". Note that "PF" and "P + F" are equivalent to each other, and both are distinct from the phone sequence "P F".
Unique Phone | Compound |
---|---|
PF | P + F |
AI | A + I |
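
The distinction between a compound and a plain sequence can be illustrated with a minimal sketch. It assumes that tokens in a phone string are separated by whitespace, as in the examples above, and it uses a hypothetical lookup table (`UNIQUE_TO_COMPONENTS`) containing only the two phones from the table. The function names are illustrative and are not part of SAPI, and the sketch does not check that a compound contains exactly two phones.

```python
# Hypothetical table: unique UPS labels and the phones they are built from.
UNIQUE_TO_COMPONENTS = {"PF": ("P", "F"), "AI": ("A", "I")}


def group_phones(phone_string):
    """Group whitespace-separated tokens into units; "+" ties its two neighbors."""
    units = []
    tie_next = False
    for token in phone_string.split():
        if token == "+":
            if not units or tie_next:
                raise ValueError('"+" must sit between two phones')
            tie_next = True
        elif tie_next:
            units[-1] = units[-1] + (token,)  # join onto the previous unit
            tie_next = False
        else:
            units.append((token,))  # start a new unit
    if tie_next:
        raise ValueError('"+" must be followed by a phone')
    return units


def canonical(phone_string):
    """Expand unique labels so that equivalent spellings compare equal."""
    return [
        UNIQUE_TO_COMPONENTS.get(unit[0], unit) if len(unit) == 1 else unit
        for unit in group_phones(phone_string)
    ]


print(canonical("PF") == canonical("P + F"))   # True
print(canonical("P + F") == canonical("P F"))  # False
```

Under these assumptions, "PF" and "P + F" both reduce to a single unit made of "P" and "F", while "P F" stays as two separate units.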
The nasal vowel "AN" can be described as the "A" vowel with a nasal diacritic. Because a diacritic must always follow a segmental phone, it attaches to that phone with or without the tying symbol, so "A nas" and "A + nas" are equivalent. The former is preferred, though SAPI does not enforce any particular use of "+" or diacritics.
Unique Phone | Compound |
---|---|
AN | A nas |
AN | A + nas |
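
As a rough illustration of the preferred form, the following sketch rewrites a phone string so that a "+" placed directly before a diacritic is dropped. It assumes whitespace-separated tokens and a hypothetical diacritic set (`DIACRITICS`) containing only "nas"; a real converter would need the full UPS diacritic list. The function name is illustrative and is not part of SAPI.

```python
# Assumption: "nas" is the only diacritic this sketch knows about.
DIACRITICS = {"nas"}


def prefer_plain_diacritics(phone_string):
    """Rewrite "A + nas" as "A nas"; leave phone-to-phone ties untouched."""
    tokens = phone_string.split()
    out = []
    for i, token in enumerate(tokens):
        next_is_diacritic = i + 1 < len(tokens) and tokens[i + 1] in DIACRITICS
        if token == "+" and next_is_diacritic:
            continue  # the tie is redundant before a diacritic
        if token in DIACRITICS and not out:
            raise ValueError("a diacritic must follow a segmental phone")
        out.append(token)
    return " ".join(out)


print(prefer_plain_diacritics("A + nas"))  # A nas
print(prefer_plain_diacritics("P + F"))    # P + F
```

Ties between two segmental phones are left alone, since removing the "+" there would turn a compound into a phone sequence.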
Speech Recognition (SR) engines must interpret compound sounds found in SAPI phone strings and map them to their internal phone set. For example, one SR engine may have an acoustic model "pf". In this case, the SAPI phone string "P + F" or "PF" would be mapped to the SR phone "pf". Another engine may not model "pf", instead modeling only "p" and "f". In this case, the SR engine would split either SAPI phone string (unique or compound) into its component parts "p" and "f".
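The two engine behaviors described above might be sketched as follows. This is not the SAPI Phone Converter interface. It is a standalone illustration that assumes whitespace-separated tokens, a hypothetical lookup table (`UNIQUE_TO_COMPONENTS`), and engine phone names formed by lowercasing and concatenating the UPS labels (which happens to match the "pf", "p", and "f" used in the text). The sets `ENGINE_A` and `ENGINE_B` are invented inventories.

```python
# Hypothetical table: unique UPS labels and the phones they are built from.
UNIQUE_TO_COMPONENTS = {"PF": ("P", "F")}

# Invented engine inventories: one models "pf" directly, the other does not.
ENGINE_A = {"pf", "p", "f"}
ENGINE_B = {"p", "f"}


def to_engine_phones(phone_string, engine_phones):
    """Map a phone string to engine phones, splitting compounds when needed."""
    units, tie_next = [], False
    for token in phone_string.split():
        if token == "+":
            if not units or tie_next:
                raise ValueError('"+" must sit between two phones')
            tie_next = True
        elif tie_next:
            units[-1].append(token)  # tie onto the previous unit
            tie_next = False
        else:
            # Expand a unique label into its components, if it has any.
            units.append(list(UNIQUE_TO_COMPONENTS.get(token, (token,))))
    if tie_next:
        raise ValueError('"+" must be followed by a phone')

    result = []
    for unit in units:
        merged = "".join(unit).lower()
        if merged in engine_phones:
            result.append(merged)  # the engine models the compound itself
            continue
        for phone in unit:  # otherwise fall back to the component phones
            if phone.lower() not in engine_phones:
                raise KeyError(f"engine has no model for {phone!r}")
            result.append(phone.lower())
    return result


print(to_engine_phones("P + F", ENGINE_A))  # ['pf']
print(to_engine_phones("PF", ENGINE_B))     # ['p', 'f']
print(to_engine_phones("P F", ENGINE_A))    # ['p', 'f']
```

In this sketch the plain sequence "P F" is never merged into "pf", even for an engine that models it, which keeps the compound and the sequence distinct.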
Such mappings are provided by the SR engine's Phone Converter object within the engine's Locale Handler. It is up to the engine developer to write this mapping code and to parse any incoming SAPI phone strings. See Parsing Guidelines for SAPI Speech Recognition Phone Converters (Microsoft.Speech) for more information.