Add new ASR model: NVIDIA Parakeet 2
NVIDIA has just released a free ASR model for transcription: https://huggingface.co/spaces/nvidia/parakeet-tdt-0.6b-v2
How can I add it to my app? Does anyone have a basic step-by-step walkthrough with screenshots?
Is there any plan to integrate with https://github.com/senstella/parakeet-mlx?
streaming transcription with parakeet-mlx would be so cool, yes!
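A common way to approximate streaming with a model that transcribes whole clips is to feed it fixed-size, overlapping windows of audio as they arrive. The sketch below shows only the windowing step in plain Python; how parakeet-mlx itself would consume each window is left out, since I haven't verified its API, and the chunk/overlap sizes here are arbitrary assumptions:

```python
def chunk_samples(samples, chunk_size, overlap):
    """Split audio samples into overlapping windows for pseudo-streaming.

    Consecutive windows share `overlap` samples, so a word cut off at one
    window's boundary still appears whole in the next window.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    windows = []
    for start in range(0, len(samples), step):
        windows.append(samples[start:start + chunk_size])
        if start + chunk_size >= len(samples):
            break  # last window already reaches the end of the audio
    return windows

# e.g. 10 samples, windows of 4 with an overlap of 2
windows = chunk_samples(list(range(10)), chunk_size=4, overlap=2)
```

Each window would then be handed to the model, and the overlapping transcripts merged (e.g. by dropping duplicated leading words).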
@Beingpax ⬆︎⬆︎ thank you!
MacWhisper just shipped an update with this model. It looks really interesting, and I'd appreciate having it in VoiceInk as well. The combination of high accuracy, high speed, streaming, and proper Apple Silicon support makes it seem pretty awesome.
Agreed. I'm giving VoiceInk a try after coming from MacWhisper. The additional support for Parakeet V2 would be amazing. It's such a great model, and I think it would enhance VoiceInk significantly.
MacWhisper is running on WhisperKit Pro, which is based on a subscription model. See pricing here.
I love the incredible speed of Parakeet models. However, based on my experience, they are slightly less accurate than the Whisper Large V3 Turbo model or even the larger models from Whisper.
At the moment, my main focus is on improving the speed of Whisper models for transcription.
@Beingpax ~~I can see "Parakeet" was released. Is this specifically "Parakeet 2"?~~ Edit: I can see it is, but the label displays "Parakeet".
@ezuk @pietz @jenningsb2 you can download it from "Settings" > "AI Models" > "Local" > "Parakeet" in the latest version of VoiceInk.
I've tested it... and it's blazing fast and accurate (English).
@ezuk off-topic: I wonder how this can complement the ZSA keyboard that I use?
> @ezuk @pietz @jenningsb2 you can download it from "Settings" > "AI Models" > "Local" > "Parakeet" in the latest version of VoiceInk.
Thanks for calling this out. I went ahead and enabled this, and also edited my fallback "Power Mode" to actually use that model (otherwise it stayed on Whisper, as it was set before).
> @ezuk off-topic: I wonder how this can complement the ZSA keyboard that I use?
This is actually a really interesting question. For me, the goal is to get into a state of flow when I'm in front of the computer. Basically, to have the machinery fade away and have something that feels like a direct interface from my mind into the screen.
Sometimes dictating gets me there, and other times using a nice keyboard and enjoying the tactility of typing gets me there.
It also depends on what exactly I'm doing. If I'm editing text or code, then naturally I'm going to use mainly my keyboard. If I'm trying to put out a large amount of prose at once and just get my thoughts out, dictating can be better. Another big thing with dictating is that I can just talk into my phone when I'm out walking or something, and then massage it into shape later (with my keyboard, at my computer).
Generally, even when I dictate (and even when I tidy it up with AI), I still end up reading with my eyes and editing with my keyboard every time.
Does that help? How do you do it? (By the way, what ZSA keyboard do you use?)
True, I still use the keyboard. I'm on the ZSA ErgoDox EZ.
Parakeet v3 is now multilingual:
- https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3
- https://huggingface.co/mlx-community/parakeet-tdt-0.6b-v3
- https://huggingface.co/alexwengg/parakeet-tdt-0.6b-v3-coreml
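Since v2 targets English while v3 is multilingual, an app supporting both could pick a checkpoint based on the user's language. A minimal sketch, using the Hugging Face IDs from the links above; the selection helper itself is a hypothetical illustration, not anything VoiceInk actually does:

```python
# Checkpoint IDs come from the Hugging Face links above; the routing
# logic is a hypothetical illustration, not VoiceInk's implementation.
PARAKEET_CHECKPOINTS = {
    "v2": "nvidia/parakeet-tdt-0.6b-v2",  # English
    "v3": "nvidia/parakeet-tdt-0.6b-v3",  # multilingual
}

def pick_checkpoint(language_code: str) -> str:
    """Keep v2 for English, route every other language to v3."""
    version = "v2" if language_code == "en" else "v3"
    return PARAKEET_CHECKPOINTS[version]
```

The same table could later grow a CoreML or MLX column per version without changing the call sites.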
Please add the latest Parakeet version; it would be awesome.