Change default voice font for the emulator

Open gabog opened this issue 7 years ago • 11 comments

Background

Is your feature request related to a problem? Please describe. I'd like to be able to test my bots with voice, but the current monotone font used by the emulator is very inexpressive and boring... (it matters if you use it all day :))

Describe the solution you'd like Would it be possible to change the default font to use Jessa or Guy? Having the ability to change the font in settings would be a great plus too.

Describe alternatives you've considered For a better voice experience, I use Web Chat, which allows me to configure the font, but I need to use ngrok or some other tool to run against localhost.

Note to fixer

  • [x] Add the custom voice font feature in BotFramework-WebChat
  • [x] Publish a new Web Chat release (prod or dev)
  • [x] Transfer the issue back to BotFramework-Emulator
  • [x] Bump Web Chat to a version that supports custom voice font
  • [ ] Add UI and entry points
  • Consider implementing UI (in Emulator app) for other speech-related features
    • [x] Custom speech (CRIS)
    • [x] Provide speech subscription key

UI to support custom speech

Custom speech is a custom speech recognition model, for recognizing words that are not in the dictionary.

The user would need to specify the following:

  • Subscription key (or authorization token, which is time-limited)
  • Endpoint ID for custom speech (not endpoint URL)

UI to support custom voice font

Custom voice font is a custom speech synthesis model, for synthesizing using a pre-trained voice.

The user would need to specify the following:

  • Subscription key (or authorization token, which is time-limited)
  • Deployment ID (not deployment/endpoint URL)
  • Voice name (this cannot be populated through the Cognitive Services REST APIs)
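
As a rough illustration of the two requirement lists above, the settings could be modelled and checked like this before they are saved. This is only a sketch: every type, field, and function name below is hypothetical and is not taken from the Emulator or Web Chat code.

```typescript
// Hypothetical settings shape that mirrors the two lists above.
type SpeechCredentials =
  | { kind: 'subscriptionKey'; subscriptionKey: string }
  | { kind: 'authorizationToken'; authorizationToken: string }; // time-limited

interface SpeechSettings {
  credentials: SpeechCredentials;
  customSpeechEndpointId?: string;  // endpoint ID, not the endpoint URL
  customVoiceDeploymentId?: string; // deployment ID, not the deployment URL
  customVoiceName?: string;         // typed by the user, cannot be looked up
}

const looksLikeUrl = (value: string): boolean => /^https?:\/\//i.test(value);

// Returns a list of human-readable problems; an empty list means "looks fine".
function validateSpeechSettings(settings: SpeechSettings): string[] {
  const errors: string[] = [];
  const { credentials } = settings;

  const secret =
    credentials.kind === 'subscriptionKey'
      ? credentials.subscriptionKey
      : credentials.authorizationToken;

  if (!secret.trim()) {
    errors.push('A subscription key or authorization token is required.');
  }

  if (settings.customSpeechEndpointId && looksLikeUrl(settings.customSpeechEndpointId)) {
    errors.push('Custom Speech expects an endpoint ID, not a URL.');
  }

  if (settings.customVoiceDeploymentId && looksLikeUrl(settings.customVoiceDeploymentId)) {
    errors.push('Custom Voice expects a deployment ID, not a URL.');
  }

  // Custom Voice needs both values, because the voice name cannot be discovered.
  if (Boolean(settings.customVoiceDeploymentId) !== Boolean(settings.customVoiceName)) {
    errors.push('Custom Voice needs both a deployment ID and a voice name.');
  }

  return errors;
}
```

The check only catches obviously wrong input (for example, a pasted URL where an ID is expected); whether the key and IDs are actually valid can only be confirmed by calling the service.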

References

  • Custom voice font support is done by changing the URL endpoint, documented here, https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#text-to-speech-api.
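
A minimal sketch of what "changing the URL endpoint" could look like. The host name patterns below are my reading of the linked REST API documentation and should be verified against it; the function and field names are hypothetical.

```typescript
// Hypothetical input: what the settings UI would collect.
interface SynthesisTarget {
  region: string;        // e.g. "westus"
  deploymentId?: string; // Custom Voice deployment ID; omit for standard voices
}

// Assumption: standard voices are served from the "tts" host, while Custom Voice
// deployments use the "voice" host plus a deploymentId query parameter.
function buildSynthesisUrl({ region, deploymentId }: SynthesisTarget): string {
  if (!deploymentId) {
    return `https://${region}.tts.speech.microsoft.com/cognitiveservices/v1`;
  }

  return (
    `https://${region}.voice.speech.microsoft.com/cognitiveservices/v1` +
    `?deploymentId=${encodeURIComponent(deploymentId)}`
  );
}
```

With this shape, the deployment ID is the only extra value needed to switch between a standard voice and a custom one; the voice name still has to be sent in the synthesis request itself.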

gabog avatar Nov 21 '18 00:11 gabog

transferring this from emulator repo since this requires work in webchat. @seaen, @compulim please triage.

vishwacsena avatar Nov 26 '18 19:11 vishwacsena

Note to fixer

Custom voice font support is done by changing the URL endpoint, documented here, https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-apis#text-to-speech-api.

compulim avatar Nov 26 '18 19:11 compulim

Hi, I know that this can be done in webchat, my ask is for the BF Emulator.

Thanks

gabog avatar Nov 26 '18 23:11 gabog

@gabog Understood, this is an ask for the Emulator. We are trying out GitHub's new issue transfer feature to get the work done.

The issue is being transferred to BotFramework-WebChat to get it fixed here, and then it will be transferred back to BotFramework-Emulator for the UI part.

We need a better diagram/dependency graph or to-do list so the issue author (or participants) know what's going on.

compulim avatar Nov 26 '18 23:11 compulim

Ah, OK, thanks.

(Edited by @compulim to remove email headers)

gabog avatar Nov 26 '18 23:11 gabog

@vishwacsena can we get your input on a priority level here?

corinagum avatar Nov 29 '18 00:11 corinagum

Hi, any updates on this one?

Thanks

gabog avatar Feb 14 '19 05:02 gabog

Web Chat now supports Custom Speech and Custom Voice. I have updated the first comment with the UI requirements (what we need from the user to enable these features).

For code, please refer to https://github.com/compulim/web-speech-cognitive-services#custom-speech-support.

compulim avatar Aug 05 '19 22:08 compulim

@tonyanziano when we implement this, you can refer to BotFramework-WebChat/SPEECH.md for how to do all of this. I am outlining it here to simplify your reading task. 😉

  • Basic
    • Input Cognitive Services subscription key
    • Input Cognitive Services region (if the region is wrong, requests with the subscription key will fail with a 401 error)
    • On save, if you want to test if the subscription key/region works, test by getting an auth token, look at this article
    • Selecting standard voices (not Custom Voice)
      • By default, we have a pretty good strategy for selecting a suitable voice, but it seems to default to a male voice due to sorting
      • To list voices and make this a combo box, you will need to play with Cognitive Services REST API, copy some code from here
      • Not all voices are compatible with all languages. If the bot is sending Chinese text and the developer selected an English voice, the request will fail with HTTP 400. But it works the other way (English text synthesized using a Cantonese/Mandarin voice).
  • Custom Speech
    • Ability to recognize trademark names
    • Input endpoint ID
    • To test: create a speech model, or ask me for one
  • Custom Voice
    • Ability to synthesize using your unique voice
    • Input deployment ID
    • Input voice model name
    • To test: create a voice model or ask me for one
  • Text normalization options
    • A speech recognition format, think about "Two 4 piece chicken nuggets." vs. "2 4 piece chicken nuggets"
    • Select between:
      • Display (add punctuation, capitalization, etc.)
      • Inverse Text Normalization
      • Masked Inverse Text Normalization
      • Lexical
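
To make the text normalization options concrete, here is a small sketch of picking one form from a recognition result. The field names follow the detailed result format of the Speech REST API as I understand it (`Display`, `ITN`, `MaskedITN`, `Lexical`); the option names, the helper, and the sample values are hypothetical and only illustrate the idea.

```typescript
// Hypothetical option values for the settings combo box.
type TextNormalization = 'display' | 'itn' | 'maskeditn' | 'lexical';

// Assumed shape of one recognition alternative in the detailed result format.
interface RecognitionAlternative {
  Display: string;
  ITN: string;
  MaskedITN: string;
  Lexical: string;
}

const fieldByOption: Record<TextNormalization, keyof RecognitionAlternative> = {
  display: 'Display',
  itn: 'ITN',
  maskeditn: 'MaskedITN',
  lexical: 'Lexical'
};

// Returns the transcript in the form the developer selected.
function pickTranscript(
  alternative: RecognitionAlternative,
  option: TextNormalization
): string {
  return alternative[fieldByOption[option]];
}

// Illustrative values only, based on the chicken nuggets example above;
// not captured from the service.
const sample: RecognitionAlternative = {
  Display: 'Two 4 piece chicken nuggets.',
  ITN: '2 4 piece chicken nuggets',
  MaskedITN: '2 4 piece chicken nuggets',
  Lexical: 'two four piece chicken nuggets'
};

const shownToBot = pickTranscript(sample, 'itn');
```

A lookup table keeps the UI option and the result field in one place, so adding or renaming an option is a one-line change.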

AFAIK, all of the above are asks from our customers.
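
For the "Basic" items in the list, the two REST calls could be described roughly as below. This is a sketch under assumptions: the URL patterns and the `Ocp-Apim-Subscription-Key` header are what I recall from the Cognitive Services REST documentation, the voice field names (`ShortName`, `Locale`, `Gender`) are an assumed subset of the voices list response, and all function names and sample voices are hypothetical. The requests are only built here, not sent.

```typescript
interface HttpRequest {
  method: 'GET' | 'POST';
  url: string;
  headers: Record<string, string>;
}

// Exchanging the key for a token is a cheap way to test the key/region pair on save.
function buildIssueTokenRequest(region: string, subscriptionKey: string): HttpRequest {
  return {
    method: 'POST',
    url: `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
    headers: { 'Ocp-Apim-Subscription-Key': subscriptionKey }
  };
}

// Lists the standard voices available in a region, to populate a combo box.
function buildListVoicesRequest(region: string, authorizationToken: string): HttpRequest {
  return {
    method: 'GET',
    url: `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`,
    headers: { Authorization: `Bearer ${authorizationToken}` }
  };
}

// Assumed subset of the fields returned for each voice.
interface Voice {
  ShortName: string;
  Locale: string;
  Gender: string;
}

// Narrows the combo box to voices for one locale, so the developer is less
// likely to pick a voice that cannot speak the bot's language.
function voicesForLocale(voices: Voice[], locale: string): Voice[] {
  return voices.filter(voice => voice.Locale.toLowerCase() === locale.toLowerCase());
}

// Illustrative data only, not a real service response.
const sampleVoices: Voice[] = [
  { ShortName: 'en-US-GuyRUS', Locale: 'en-US', Gender: 'Male' },
  { ShortName: 'en-US-JessaRUS', Locale: 'en-US', Gender: 'Female' },
  { ShortName: 'zh-CN-HuihuiRUS', Locale: 'zh-CN', Gender: 'Female' }
];

const englishVoices = voicesForLocale(sampleVoices, 'en-us');
```

Each descriptor can be passed to `fetch` (or any HTTP client) by the caller. Per the note in the list, a 401 response to the token request would suggest that the key and region do not match.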

compulim avatar Aug 22 '19 09:08 compulim

Any movement on this issue?

jbgh2 avatar Mar 11 '20 05:03 jbgh2

Hi @jbgh2 ,

We have not made any progress on implementing this feature. However, we will soon be holding our planning discussion to decide what to work on for the next release, so stay tuned.

tonyanziano avatar Mar 11 '20 15:03 tonyanziano