NaturalVoiceSAPIAdapter

Does SSML language work for multilingual voices?

Open vortex1024 opened this issue 1 year ago • 2 comments

For the online multilingual voices, Microsoft recommends using SSML to force the language when it is not detected correctly. Does your project support that? I could not make this work. For example, with the Brian multilingual voice selected in the TTS application supplied in the package, and with "process XML" on, I pass: <lang xml:lang="ro-RO">ce zici de asasin?</lang>
It speaks French, not Romanian. I also tried enclosing the text in an element and setting its lang attribute, but no go. Thanks.

vortex1024, May 13 '24 17:05

Unfortunately, Microsoft Edge online voices only support a very limited subset of SSML. <lang> tags are not supported.

Also, any unsupported SSML tag will make the server throw an "SSML is invalid" error and close the connection. So this engine has to filter out all SSML tags except a few supported ones, such as <prosody>, before sending the SSML to the Edge voice server.
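To illustrate the kind of filtering described above, here is a minimal Python sketch. It is not this engine's actual code: the allowlist (`speak`, `voice`, `prosody`), the regular expression, and the function name `strip_unsupported_tags` are all assumptions made for the example. It drops every tag that is not on the allowlist but keeps the text inside it.

```python
import re

# Assumption: an illustrative allowlist. Only <prosody> and the root <speak>
# are mentioned in this thread; <voice> is added here as a guess.
ALLOWED_TAGS = frozenset({"speak", "voice", "prosody"})

# Matches an opening, closing, or self-closing tag and captures its name.
# Simplification: attribute values containing ">" are not handled.
_TAG_RE = re.compile(r"</?([A-Za-z_][\w:.-]*)[^>]*>")


def strip_unsupported_tags(ssml: str, allowed=ALLOWED_TAGS) -> str:
    """Remove tags whose name is not in `allowed`, keeping their inner text."""

    def keep_or_drop(match: re.Match) -> str:
        name = match.group(1).lower()
        return match.group(0) if name in allowed else ""

    return _TAG_RE.sub(keep_or_drop, ssml)


if __name__ == "__main__":
    sample = (
        '<speak><lang xml:lang="ro-RO">ce zici?</lang> '
        '<prosody rate="fast">ok</prosody></speak>'
    )
    print(strip_unsupported_tags(sample))
```

With a filter like this, a <lang> element is silently removed and only its text reaches the server, which would be consistent with the language switch having no effect.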

The Edge voice server requires an xml:lang attribute on the root <speak> element. But changing it seems to do nothing.

So no, changing the language is not supported by Edge voices.

But if you have an Azure Speech subscription key, you can use the Azure voices, which support that feature.

Currently this engine does not enumerate Azure voices, so if you want to use an Azure voice, you will have to add it manually to the registry.

In registry editor, create a registry key under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens. Then, create the following keys & values inside this key:

  • String value (Default): display name of the voice, e.g. Microsoft BrianMultilingual
  • String value CLSID: {013ab33b-ad1a-401c-8bee-f6e2b046a94e}
  • Subkey: Attributes
    • String value Language: hexadecimal language ID of the voice, e.g. 409 for English (US)
  • Subkey: NaturalVoiceConfig
    • String value Region: service region, e.g. japaneast
    • String value Key: your subscription key
    • String value Voice: voice name, e.g. en-US-BrianMultilingualNeural
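If you would rather not create these entries by hand, the layout above can be written as a .reg file and imported. The following Python sketch only builds the file's text; the token key name (`MyAzureBrian`), the function names, and the placeholder subscription key are hypothetical, and the other values are the examples from the list above.

```python
# Root key and CLSID are taken from the instructions above.
TOKENS_ROOT = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens"
CLSID = "{013ab33b-ad1a-401c-8bee-f6e2b046a94e}"


def reg_escape(value: str) -> str:
    """Escape backslashes and double quotes for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_reg_file(token_name, display_name, language_id, region, key, voice):
    """Return .reg file text for one manually registered Azure voice.

    `token_name` is a hypothetical name for the new key under Tokens.
    """
    base = TOKENS_ROOT + "\\" + token_name
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{base}]",
        f'@="{reg_escape(display_name)}"',       # (Default) value: display name
        f'"CLSID"="{reg_escape(CLSID)}"',
        "",
        f"[{base}\\Attributes]",
        f'"Language"="{reg_escape(language_id)}"',  # hexadecimal language ID
        "",
        f"[{base}\\NaturalVoiceConfig]",
        f'"Region"="{reg_escape(region)}"',
        f'"Key"="{reg_escape(key)}"',
        f'"Voice"="{reg_escape(voice)}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_reg_file(
        token_name="MyAzureBrian",               # hypothetical key name
        display_name="Microsoft BrianMultilingual",
        language_id="409",
        region="japaneast",
        key="YOUR-SUBSCRIPTION-KEY",             # placeholder, not a real key
        voice="en-US-BrianMultilingualNeural",
    ))
```

Save the output to a file with a .reg extension and import it from an elevated Registry Editor, since writing under HKEY_LOCAL_MACHINE normally requires administrator rights.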

Check this for a list of Edge online voice names ("ShortName").

gexgd0419, May 14 '24 15:05

Thanks for the detailed explanation. It is a shame this does not work. The only way I can imagine it being made to work for free is to always pass in some unique string in that language, and then cut the corresponding audio from the resulting WAV.

vortex1024, May 24 '24 21:05