Patrick von Platen
@ptillet is there anything we could do to help you implement this? With PyTorch 2.0+ becoming more and more dependent on Triton, this feature request will only become more important...
Cool - I'll try this :-)
Sadly, @stefan-it and I haven't gotten it to work yet :-/. As you can see on [this](https://huggingface.co/amazon/bort?text=Paris+is+the+%3Cmask%3E+of+France.) page, the proposed words to fill the mask don't seem sensible...
Ah, I think I forgot to write that you also need to install this library: https://github.com/kpu/kenlm#installation. Can you give the pip install command a try and see whether it...
If it works, it would be amazing if you could make a quick PR to update `requirements.txt` and `README.md` :-)
Yeah, this looks like an issue with the `.arpa` file. A good debugging strategy would be to:
- take less text to create the ngram -> instead of a...
Hey @XiaoshanHsj, sorry, this library is not actively maintained anymore. Please have a look at https://huggingface.co/docs/transformers/model_doc/wav2vec2#transformers.Wav2Vec2ProcessorWithLM instead.
That's a great PR @pcuenca! I think some clean-ups are still needed to make it functional, but overall the design seems correct to me.
Thanks a lot for the summary @pcuenca and great job finding the bug! Regarding the questions: 1.) I think we should **not** create float64 timesteps, but just keep float32 timesteps...
> Fantastic~ Should we expect the speedup to be less for non-English audio on the English distilled model? Not familiar with the ins and outs of speculative decoding.

That really...