BotFramework-Emulator
Change default voice font for the emulator
Background
Is your feature request related to a problem? Please describe.
I'd like to be able to test my bots with voice, but the monotone font currently used by the emulator is very inexpressive and boring (it matters if you use it all day :)).
Describe the solution you'd like
Would it be possible to change the default font to Jessa or Guy? Having the ability to change the font in settings would be a great plus too.
Describe alternatives you've considered
For a better voice experience, I use webchat, which allows me to configure the font, but I need to use ngrok or some other tool to run against localhost.
Note to fixer
- [x] Add the custom voice font feature in BotFramework-WebChat
- [x] Publish a new Web Chat release (prod or dev)
- [x] Transfer the issue back to BotFramework-Emulator
- [x] Bump Web Chat to a version that supports custom voice font
- [ ] Add UI and entry points
- Consider implementing UI (in Emulator app) for other speech-related features
- [x] Custom speech (CRIS)
- [x] Provide speech subscription key
UI to support custom speech
Custom speech is a custom speech recognition model, for recognizing words that are not in the dictionary.
The user would need to specify the following:
- Subscription key (or authorization token, which is time-limited)
- Endpoint ID for custom speech (not endpoint URL)
UI to support custom voice font
Custom voice font is a custom speech synthesis model, for synthesizing using a pre-trained voice.
The user would need to specify the following:
- Subscription key (or authorization token, which is time-limited)
- Deployment ID (not deployment/endpoint URL)
- Voice name (this cannot be populated through REST APIs to Cognitive Services)
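The two requirement lists above could be captured as a settings shape plus a small validation step for the UI. This is only a sketch: every type and function name below is hypothetical (none comes from Web Chat or the Emulator), and the rule that exactly one of subscription key or authorization token is supplied is an assumption, not something the requirements state.

```typescript
// Hypothetical settings shapes for the Emulator UI; names are illustrative only.
interface SpeechCredentials {
  subscriptionKey?: string;
  authorizationToken?: string; // time-limited alternative to the key
}

interface CustomSpeechSettings extends SpeechCredentials {
  endpointId: string; // an ID, not an endpoint URL
}

interface CustomVoiceSettings extends SpeechCredentials {
  deploymentId: string; // an ID, not a deployment/endpoint URL
  voiceName: string;
}

// True when the value starts with a URL scheme such as "https://".
function looksLikeUrl(value: string): boolean {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(value.trim());
}

// Assumption: exactly one of key or token should be provided.
function validateCredentials(c: SpeechCredentials): string[] {
  const hasKey = !!c.subscriptionKey;
  const hasToken = !!c.authorizationToken;
  return hasKey !== hasToken
    ? []
    : ["Provide either a subscription key or an authorization token."];
}

// Rejects empty values and values that look like URLs rather than IDs.
function validateId(label: string, value: string): string[] {
  if (value.trim() === "") return [label + " is required."];
  if (looksLikeUrl(value)) return [label + " must be an ID, not a URL."];
  return [];
}

function validateCustomSpeech(s: CustomSpeechSettings): string[] {
  return validateCredentials(s).concat(validateId("Endpoint ID", s.endpointId));
}

function validateCustomVoice(s: CustomVoiceSettings): string[] {
  const voiceErrors = s.voiceName.trim() === "" ? ["Voice name is required."] : [];
  return validateCredentials(s)
    .concat(validateId("Deployment ID", s.deploymentId))
    .concat(voiceErrors);
}
```

A settings dialog could run these checks on save and show the returned messages next to the relevant fields.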
References
- Custom voice font support is done by changing the URL endpoint, documented here, https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#text-to-speech-api.
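As a rough illustration of "changing the URL endpoint", the sketch below picks a synthesis URL based on whether a deployment ID is present. The host name patterns and the `deploymentId` query parameter are assumptions based on my reading of the Speech service REST documentation linked above, and `buildSynthesisUrl` is a hypothetical helper, not part of Web Chat; check both against the current docs before relying on them.

```typescript
// Hypothetical helper. The URL patterns below are assumptions and should be
// verified against the Cognitive Services Speech REST API documentation.
function buildSynthesisUrl(region: string, deploymentId?: string): string {
  if (!deploymentId) {
    // Standard voices: regional text-to-speech endpoint.
    return "https://" + region + ".tts.speech.microsoft.com/cognitiveservices/v1";
  }
  // Custom voice: different host, with the deployment ID as a query parameter.
  return (
    "https://" + region + ".voice.speech.microsoft.com/cognitiveservices/v1" +
    "?deploymentId=" + encodeURIComponent(deploymentId)
  );
}
```

The point of the sketch is that the UI only needs to collect the region and deployment ID; the URL itself can be derived.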
Transferring this from the Emulator repo since this requires work in Web Chat. @seaen, @compulim please triage.
Hi, I know that this can be done in webchat, my ask is for the BF Emulator.
Thanks
@gabog Understood, this is an ask for the Emulator. We are trying out GitHub's new issue transfer feature to get the work done.
The issue is being transferred to BotFramework-WebChat to get it fixed here, and then it will be transferred back to BotFramework-Emulator for the UI part.
We need a better diagram/dependency graph or to-do list so the issue author (or participants) know what's going on.
Ah, OK, thanks.
(Edited by @compulim to remove email headers)
@vishwacsena can we get your input on a priority level here?
Hi, any updates on this one?
Thanks
Web Chat now supports Custom Speech and Custom Voice. I have updated the first comment with the UI requirements (what we need from the user to enable these features).
For code, please refer to https://github.com/compulim/web-speech-cognitive-services#custom-speech-support.
@tonyanziano when we implement this, you can refer to BotFramework-WebChat/SPEECH.md for how to do each of these. I am outlining it here to simplify your reading task. 😉
- Basic
  - Input Cognitive Services subscription key
  - Input Cognitive Services region (if the region is wrong, requests using the subscription key will fail with a 401 error)
  - On save, if you want to test whether the subscription key/region works, test by getting an auth token, look at this article
- Selecting standard voices (not Custom Voice)
  - By default, we have a pretty good strategy for selecting a suitable voice, but it seems to default to a male voice due to sorting
  - To list voices and make this a combo box, you will need to play with the Cognitive Services REST API, copy some code from here
  - Not all voices are compatible with all languages. If the bot is sending Chinese text and the developer selected an English voice, the result will be HTTP 400. But it works the other way (English text synthesized using a Cantonese/Mandarin voice).
- Custom Speech
  - Ability to recognize trademark names
  - Input endpoint ID
  - To test: create a speech model, or ask me for one
- Custom Voice
  - Ability to synthesize using your unique voice
  - Input deployment ID
  - Input voice model name
    - Voice model can be enumerated using REST API, look at this Swagger API
  - To test: create a voice model, or ask me for one
- Text normalization options
  - A speech recognition format, think about "Two 4 piece chicken nuggets." vs. "2 4 piece chicken nuggets"
  - Select between:
    - Display (add punctuation, capitalization, etc.)
    - Inverse Text Normalization
    - Masked Inverse Text Normalization
    - Lexical
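
To make the text normalization setting concrete, here is a small sketch of how a chosen option could select one form of a recognition result. The field names (`Display`, `ITN`, `MaskedITN`, `Lexical`) follow the detailed recognition output as I understand it and should be treated as an assumption; the type names, `pickText`, and the sample values are hypothetical and are not real service output.

```typescript
// Hypothetical option type for the settings UI.
type TextNormalization = "display" | "itn" | "maskeditn" | "lexical";

// Assumed shape of one recognition candidate in the detailed output format.
interface RecognitionCandidate {
  Display: string;
  ITN: string;
  MaskedITN: string;
  Lexical: string;
}

// Returns the form of the recognized text that matches the selected option.
function pickText(candidate: RecognitionCandidate, option: TextNormalization): string {
  switch (option) {
    case "display":
      return candidate.Display;
    case "itn":
      return candidate.ITN;
    case "maskeditn":
      return candidate.MaskedITN;
    case "lexical":
      return candidate.Lexical;
  }
}

// Made-up sample values for illustration only.
const sampleCandidate: RecognitionCandidate = {
  Display: "Two 4 piece chicken nuggets.",
  ITN: "2 4 piece chicken nuggets",
  MaskedITN: "2 4 piece chicken nuggets",
  Lexical: "two four piece chicken nuggets",
};
```

With this shape, the settings UI only has to store a single option value, and the rest of the app can stay unaware of which form was picked.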
AFAIK, all of the above are asks from our customers.
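
For the "test by getting an auth token" step under Basic in the outline above, a minimal sketch could look like the following. The token URL pattern and the `Ocp-Apim-Subscription-Key` header are assumptions taken from my understanding of the Cognitive Services token API, and `buildTokenRequest`, `checkCredentials`, and the injected `send` function are hypothetical names introduced so the logic can be exercised without a network call.

```typescript
// Describes the HTTP request used to exchange a subscription key for a token.
interface TokenRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
}

// Assumed token endpoint pattern; verify against the Cognitive Services docs.
function buildTokenRequest(region: string, subscriptionKey: string): TokenRequest {
  return {
    url: "https://" + region + ".api.cognitive.microsoft.com/sts/v1.0/issueToken",
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": subscriptionKey },
  };
}

// Minimal sender signature; in the app this could wrap fetch or another HTTP client.
type SendRequest = (request: TokenRequest) => Promise<{ status: number }>;

type CredentialCheck = "ok" | "unauthorized" | "error";

// Maps the HTTP status of the token request to a result the UI can display.
async function checkCredentials(
  region: string,
  subscriptionKey: string,
  send: SendRequest
): Promise<CredentialCheck> {
  const response = await send(buildTokenRequest(region, subscriptionKey));
  if (response.status === 200) return "ok";
  if (response.status === 401) return "unauthorized"; // wrong key or wrong region
  return "error";
}
```

Because a wrong region also surfaces as a 401, the UI message for "unauthorized" should probably mention both the key and the region.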
Any movement on this issue?
Hi @jbgh2,
We have not made any progress on implementing this feature. However, we will soon be holding our planning discussion to decide what to work on for the next release, so stay tuned.