I mentioned before that Windows 10 comes with two flavors of SAPI: 5.4 proper and OneCore, which is internally also marked as v5.4. The System.Speech.Synthesis classes of the .NET library employ the classic SAPI, while the Windows.Media.SpeechSynthesis classes of the UWP library run on top of OneCore. You can tell which one is in use from the contents of the voice Id values. Native applications that target SAPI directly, via COM, have the option of using either. Why SAPI was forked like that, I'm not sure.
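To illustrate the voice Id check, here is a minimal sketch that classifies an Id string by the registry path it contains. The two path fragments (`Speech` vs. `Speech_OneCore`) are my assumption about how the Ids usually look, the sample Ids are illustrative only, and `sapi_flavor` is a made-up helper name, not part of any API.

```python
# Assumption: voice Ids are registry paths, and the two SAPI flavors keep
# their voice tokens under different keys. These fragments are not taken
# from the SAPI documentation; treat them as an educated guess.
CLASSIC_MARKER = r"\Microsoft\Speech\Voices\Tokens"
ONECORE_MARKER = r"\Microsoft\Speech_OneCore\Voices\Tokens"


def sapi_flavor(voice_id: str) -> str:
    """Return 'onecore', 'classic', or 'unknown' for a voice Id string."""
    normalized = voice_id.lower()
    if ONECORE_MARKER.lower() in normalized:
        return "onecore"
    if CLASSIC_MARKER.lower() in normalized:
        return "classic"
    return "unknown"


if __name__ == "__main__":
    # Illustrative Ids only; the token names on a real machine may differ.
    samples = [
        r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_JA-JP_HARUKA_11.0",
        r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_AyumiM",
    ]
    for sample in samples:
        print(sapi_flavor(sample), sample)
```

You would feed it the Id string reported by whichever speech API you are using, e.g. the `Id` property of a voice information object.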
The API surfaces of SAPI proper and SAPI OneCore are almost identical. Their classes have different CLSIDs, and their typelibs have different LibIDs (but the same name, so you have two identical lines in the registered typelib list).
The set of supported interfaces is different: SAPI proper supports both custom interfaces (e.g. ISpVoice) and dual, IDispatch-based, Automation-compatible ones (e.g. ISpeechVoice). Meanwhile, OneCore only has the custom ones. The IIDs are all identical. The lack of Automation support means you can't access OneCore from JavaScript/VBScript clients.
The discrepancy causes a beautiful disconnect in the Windows settings. If you go to Settings/Time and Language/Speech, you'd see three Japanese voices (Ayumi/Haruka/Ichiro). If you go to Control Panel/Speech Recognition/Text to Speech, you'd see only Haruka. The former is a UWP app, the latter is a desktop one.
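One way to look at that discrepancy from the outside is to list the voice tokens registered under each flavor. The sketch below does that with Python's standard `winreg` module. The two registry subkeys are my assumption about where the tokens live, and `list_voice_tokens` is a hypothetical helper; on a non-Windows machine, or if a key is missing, it just returns an empty list.

```python
from typing import List

try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Assumption: voice tokens for the two flavors are registered under these
# HKEY_LOCAL_MACHINE subkeys. Not confirmed by the SAPI documentation.
CLASSIC_TOKENS = r"SOFTWARE\Microsoft\Speech\Voices\Tokens"
ONECORE_TOKENS = r"SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens"


def list_voice_tokens(subkey: str) -> List[str]:
    """Return the names of the voice token keys under the given subkey."""
    if winreg is None:
        return []  # not running on Windows
    try:
        key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey)
    except OSError:
        return []  # key is missing or not readable
    names = []
    with key:
        index = 0
        while True:
            try:
                names.append(winreg.EnumKey(key, index))
            except OSError:
                break  # no more subkeys
            index += 1
    return names


if __name__ == "__main__":
    print("SAPI proper:", list_voice_tokens(CLASSIC_TOKENS))
    print("OneCore:    ", list_voice_tokens(ONECORE_TOKENS))
```

The token names under the two keys are not expected to match one-to-one, so this is meant for eyeballing the two lists rather than for computing a difference.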
On Windows 8.1 and before, there was no SAPI OneCore. The respective classes of the WinRT API (the predecessor to UWP) would work with SAPI proper instead. The same WinRT 8.1 application, if run on Windows 10, would use OneCore, though. There was no explicit way of installing TTS voices on Windows 8.1, the way there is on Windows 10, but if you installed a language via PC Settings/Region and Language, at least for Japanese, that would install the Haruka voice under SAPI as well.
The Windows 8.0 iteration of WinRT didn't have the Windows.Media.SpeechSynthesis namespace.
SAPI 5.4 proper has shipped with the OS since at least Windows 7.