23 comments by pablogranolabar
Yeah, for example: https://github.com/huggingface/transformers/pull/17901
Ah sorry I was referring to the Accelerate framework used with PyTorch. Here's a decent writeup of their 8-bit quantization methods: https://huggingface.co/blog/hf-bitsandbytes-integration
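For a rough feel of what 8-bit quantization does to weights, here is a minimal sketch of absmax quantization, one of the basic building blocks that writeups like the one linked above cover. This is not the bitsandbytes implementation, only an illustration in plain Python; the function names `absmax_quantize` and `dequantize` and the sample weights are made up for the example.

```python
def absmax_quantize(values):
    """Map floats into the int8 range [-127, 127] using one shared scale.

    Illustrative helper (hypothetical name), not a library API.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        # All zeros: any scale works, so pick 1.0 to avoid dividing by zero.
        return [0 for _ in values], 1.0
    scale = 127.0 / max_abs
    return [round(v * scale) for v in values], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 codes."""
    return [q / scale for q in quantized]


# Sample values chosen for illustration only.
weights = [1.2, -0.5, -4.3, 1.2, -3.1, 0.8, 2.4, 5.4]
q, scale = absmax_quantize(weights)
restored = dequantize(q, scale)

# Rounding to the nearest integer bounds the per-value error by 0.5 / scale.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q)
print(max_error)
```

The largest-magnitude value always maps to 127 or -127, so a single outlier stretches the scale and costs precision for everything else. That is the usual motivation for treating outliers separately in real 8-bit inference schemes.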
Nah, Whisper is configurable for whatever input length you specify. We have a Flutter port going now that is near real-time on mobile. The larger models on CPU should be...