Speech to text

Open allisonmorrell opened this issue 1 year ago • 35 comments

I'm experimenting with Chainlit and it is awesome. Kudos to you all.

Future feature request: I would like to incorporate speech to text in the user interface for my application.

Please label as enhancement.

allisonmorrell avatar May 31 '23 20:05 allisonmorrell

Would you be able to contribute this feature? I would be willing to coach you with adding this functionality.

gruckion avatar Jun 06 '23 12:06 gruckion

I can't contribute in the short term but this is a good first issue (small and scoped to the UI). If any of you wants to make a PR (fork the repo and then open a PR against the Chainlit main branch) I would be more than happy to review it!

willydouhard avatar Jun 06 '23 16:06 willydouhard

Would you be able to contribute this feature? I would be willing to coach you with adding this functionality.

Unfortunately, I don't know any TypeScript/React. That said, we are hoping to involve more people in our project including someone with more front-end experience so if someone were to outline the recommended approach at a very high level I might be able to find someone who can contribute.

allisonmorrell avatar Jun 08 '23 16:06 allisonmorrell

Can someone assign this to me?

mmnasser2000 avatar Sep 07 '23 16:09 mmnasser2000

Hi @willydouhard, I am also working on this. If I come up with something good, I will ask you to assign it to me and will send a PR for review. Thanks!

sqrt676 avatar Oct 07 '23 18:10 sqrt676

is this implemented?

poojitharamachandra avatar Oct 13 '23 06:10 poojitharamachandra

@poojitharamachandra It's on its way: https://github.com/Chainlit/chainlit/pull/375

alimtunc avatar Oct 13 '23 06:10 alimtunc

do you know in which version this will be released?

poojitharamachandra avatar Oct 13 '23 09:10 poojitharamachandra

Is this available in the latest version, 0.7.604? If so, how do I use it? https://docs.chainlit.io/backend/config/features

poojitharamachandra avatar Nov 24 '23 14:11 poojitharamachandra

Yes, it is available in 0.7.604. To enable it, just update your config to:

[features]
prompt_playground = true
multi_modal = true
# Allows user to use speech to text
[features.speech_to_text]
    enabled = true
    language = "en-US"

willydouhard avatar Nov 24 '23 16:11 willydouhard

where should i make the above changes?

poojitharamachandra avatar Nov 25 '23 04:11 poojitharamachandra

.chainlit/config.toml – discussion of features section here.

allisonmorrell avatar Nov 25 '23 05:11 allisonmorrell

Thanks. Where can I find this file on a Windows system?

poojitharamachandra avatar Nov 25 '23 05:11 poojitharamachandra

It is located in the directory where you ran chainlit run ..., at .chainlit/config.toml.

willydouhard avatar Nov 25 '23 10:11 willydouhard

Thanks, now I see the option to record. But I am not able to record and post voice-based questions. Can you recommend a tutorial for that, and for migrating from 0.6.x to 0.7.x?

poojitharamachandra avatar Nov 25 '23 14:11 poojitharamachandra

Here is the migration guide to 0.7.x

willydouhard avatar Nov 25 '23 15:11 willydouhard

is there an example on how to use speech?

poojitharamachandra avatar Nov 25 '23 16:11 poojitharamachandra

Hi @willydouhard, really nice feature. But I want to customize the model used for the speech-to-text part in Python. How can I do that?

LeDuySon avatar Nov 29 '23 10:11 LeDuySon

This is not supported yet. We rely on the browser APIs. We are open to contributions that optionally make it use a custom endpoint on the Chainlit FastAPI server!

willydouhard avatar Nov 29 '23 10:11 willydouhard

is there an example on how to use speech? i am unable to record my voice from the UI

poojitharamachandra avatar Dec 02 '23 05:12 poojitharamachandra

Might not work on all browsers since it relies on browser APIs. What browser do you use?

willydouhard avatar Dec 02 '23 11:12 willydouhard

I use Edge. I see the voice recording button in the UI, but the question is not posted to the chat window as text. I also tried Mozilla Firefox, but there I don't see the voice recording option at all. Which browser do you recommend?

poojitharamachandra avatar Dec 05 '23 04:12 poojitharamachandra

Can you try with chrome?

willydouhard avatar Dec 05 '23 09:12 willydouhard

Due to security issues, I cannot use Chrome on the corporate network.

poojitharamachandra avatar Dec 05 '23 09:12 poojitharamachandra

Can you try with chrome?

I know I am late to the party, but I just tested speech to text in 0.7.700 and 1.0.0RC3 with the latest Chrome (120) on macOS. I can test in other environments (Windows, Ubuntu) and browsers if you have specific requests.

137particles avatar Jan 02 '24 23:01 137particles

Does it work as intended @137particles ?

willydouhard avatar Jan 03 '24 11:01 willydouhard

@willydouhard Yes. It correctly prompted for microphone permission, and once that was granted I was able to click the icon and it captured what I said. It then did what appeared to be silence detection and turned the microphone off again.

137particles avatar Jan 03 '24 17:01 137particles

@137particles can you help me enable the mic in Chainlit? I have changed the config.toml file:

[features.speech_to_text]
enabled = true
language = "en-US"

but it's not working.

jitkoley-kasmo avatar Jan 09 '24 04:01 jitkoley-kasmo

What browser do you use?

willydouhard avatar Jan 11 '24 08:01 willydouhard

The same thing is happening to me.

[features.speech_to_text]
enabled = true
language = "en-US"

I edited the config.toml file. I am using the Chrome browser. ✅ I gave microphone permission as well. ✅ I can see the microphone button as well. When I click the microphone button, it starts and then stops automatically within 2 seconds. I don't know whether it records anything, because no log is captured.

@willydouhard Can you please give me some direction? And thank you man, this library is awesome. 🔥🔥 Very easy to use.

Girrajjangid avatar Jan 25 '24 15:01 Girrajjangid