
[New Model] Parakeet-tdt-0.6b-v2 by NVIDIA

Open vernalan opened this issue 5 months ago • 4 comments

Parakeet-tdt-0.6b-v2 by NVIDIA is a 600-million-parameter automatic speech recognition (ASR) model designed for high-quality English transcription, featuring support for punctuation, capitalization, and accurate timestamp prediction.

The model is less than 3 GB in size and can output both plain text and SRT subtitles. I think it would be a great addition to dsnote's STT model choices for the English language.
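To illustrate the SRT side: given segment-level timestamps such as the ones the model predicts, building a subtitle file is mostly a formatting step. The snippet below is only a minimal sketch. The `(start_seconds, end_seconds, text)` tuples and the function names `srt_timestamp` and `segments_to_srt` are hypothetical and are not part of the model's or dsnote's API.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is an assumption made for this sketch; a real
    engine would have to map its own timestamp output into it.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: running number, time range, text, then a blank line.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, "Hello."), (1.5, 3.0, "World.")]
    print(segments_to_srt(demo))
```

Calling `segments_to_srt` with a list of such tuples returns a string that can be written directly to a `.srt` file.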

Link: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2

vernalan avatar Jul 02 '25 15:07 vernalan

Hi, thanks for the information. NVIDIA NeMo engine support would be needed to enable this model. I will investigate what can be done.

mkiol avatar Jul 06 '25 14:07 mkiol

It would be awesome 🙌🏻

navarrothiago avatar Jul 09 '25 23:07 navarrothiago

@mkiol just wanted to revive this to let you know there's an onnx-asr library that lets you load Parakeet:

https://github.com/istupakov/onnx-asr

jwinpbe avatar Aug 16 '25 10:08 jwinpbe

The model has been updated from v2 to v3. Link to the updated model: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3

vernalan avatar Aug 16 '25 13:08 vernalan