MyButtermilk

Results 20 comments of MyButtermilk

This model is stronger: https://github.com/PaddlePaddle/PaddleOCR

Thank you very much for your relevant points. 1. I just tried it in Google AI Studio. This 19-minute video https://www.youtube.com/watch?v=Lj7bsHnhoD8 comes to a pretty massive 330,601 tokens with...

@nbonamy : So, what do you think?

Nice design! Maybe add a drop zone? https://www.dropzone.dev/

One solution would be to support pipecat: https://github.com/pipecat-ai/pipecat It supports natural streaming conversations with LLMs and is provider-agnostic. They even just added Cartesia as they introduced...

Cartesia Ink-Whisper is more accurate than Fireworks: ![Image](https://github.com/user-attachments/assets/700e6c2c-1858-449a-9c3f-6d54fd5be687) And it is also faster: ![Image](https://github.com/user-attachments/assets/1695019b-d769-4661-b6c0-ef02fcd4e7a7) 5.5 hours of transcription per month are free. Above that you need a subscription of 5...

Fireworks just launched a turnkey solution: https://fireworks.ai/blog/voice-agents

Ultravox would be another alternative: https://www.ultravox.ai/blog/ultravox-v0-5-taking-the-lead-in-speech-understanding The weights are available; here is the 8B version: https://huggingface.co/fixie-ai/ultravox-v0_5-llama-3_1-8b

The implementation in Handy works well on Windows and Mac, and its author encourages people to fork it: https://github.com/cjpais/Handy