whisper.cpp
tdrz and coreml support?
Setting up a new MacBook Pro (M2), I added CoreML and it works great, except with the new tdrz feature.
Running ./models/generate-coreml-model.sh small.en-tdrz fails: small.en-tdrz is missing from the conversion script's list of options.
Traceback (most recent call last):
File "/whisper.cpp/models/convert-whisper-to-coreml.py", line 306, in <module>
raise ValueError("Invalid model name")
ValueError: Invalid model name
coremlc: error: Model does not exist at models/coreml-encoder-small.en-tdrz.mlpackage -- file:////whisper.cpp/
mv: rename models/coreml-encoder-small.en-tdrz.mlmodelc to models/ggml-small.en-tdrz-encoder.mlmodelc: No such file or directory
cc @akashmjn, who proposed the original (awesome) PR. This is important for speed improvements on iPhones/Mac.
I was able to work around this by adding the lines at
https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/init.py#L23 https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/init.py#L40
to the whisper package installed in my miniconda environment:
~/miniconda3/envs/py310-whisper/lib/python3.10/site-packages/whisper/__init__.py
There's probably a better way, but it worked heh...
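The workaround above can be sketched without touching site-packages by hand. This is only an illustration: it assumes the two linked lines correspond to one entry in a name-to-URL table and one entry in an alignment-heads table, and that reusing the small.en heads is acceptable. `register_checkpoint`, the demo tables, and the example.invalid URLs are hypothetical stand-ins, not the real checkpoint location.

```python
def register_checkpoint(models, alignment_heads, name, url, heads_from):
    """Add `name` to both lookup tables.

    `models` maps a model name to a checkpoint URL or path.
    `alignment_heads` maps a model name to its alignment-head data.
    The new entry reuses the heads of `heads_from` (an assumption).
    Existing entries are left untouched.
    """
    if heads_from not in alignment_heads:
        raise KeyError(f"no alignment heads known for {heads_from!r}")
    models.setdefault(name, url)
    alignment_heads.setdefault(name, alignment_heads[heads_from])
    return name in models and name in alignment_heads


# Stand-in tables so the sketch runs without openai-whisper installed.
demo_models = {"small.en": "https://example.invalid/small.en.pt"}
demo_heads = {"small.en": b"placeholder-heads"}

registered = register_checkpoint(
    demo_models,
    demo_heads,
    name="small.en-tdrz",
    url="https://example.invalid/small.en-tdrz.pt",  # placeholder, not the real URL
    heads_from="small.en",
)
print(registered, sorted(demo_models))
```

With openai-whisper installed, the same call could presumably be made against its private `_MODELS` and `_ALIGNMENT_HEADS` tables, if your installed version has both. Such a change only lives in the Python process that makes it, so it would not carry over to a conversion script started separately.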
@akeybl do you have this model uploaded somewhere? Perhaps we could add it to the same folder on Hugging Face where @akashmjn already saved the non-CoreML version: https://huggingface.co/akashmjn/tinydiarize-whisper.cpp
Hey @akeybl, thanks for the cc! I was on break for a bit, hence the delay. Looks like I missed the CoreML conversion in this PR: https://github.com/ggerganov/whisper.cpp/pull/1001
I'll take a look and fix this later today (both conversion from the PyTorch checkpoint and directly supporting small.en-tdrz in download-coreml-model.sh).
Excited to see someone give this a spin on an iPhone!
@akashmjn would it also be possible to get tiny.en-tdrz?
1). Regarding a finetuned tiny.en-tdrz: I'd tried it, but it didn't actually work very well. Likely because it is a very weirdly shaped model (token embeddings are >50% of total params).
2). I just looked into generate-coreml-model.sh. Everything should work; you just need to ensure your local whisper package finds the small.en-tdrz checkpoint name.
This can be done either by
- replacing the Python openai-whisper package in your env with my fork: pip install https://github.com/akashmjn/tinydiarize.git
- just hacking whisper.__init__.py as @akeybl did (totally valid 🙂) by adding a path to my mirrored PyTorch checkpoint.
3). Regarding download-coreml-model.sh or hosting of pre-converted CoreML checkpoints: it appears from https://github.com/ggerganov/whisper.cpp/pull/566 that until things stabilize with CoreML, the maintainer's (ggerganov's) recommendation is that everyone convert locally themselves. So in the meantime, would adding a -tdrz section to ./models/README.md help?
I don't actually have the right Mac/iPhone hardware atm to test the CoreML stuff, so let me know how it goes.
1). Regarding a finetuned tiny.en-tdrz , i'd tried it but it didn't actually work very well. Likely because it is a very weirdly shaped model (token embeddings are >50% of total params).
@akashmjn, would a base.en-tdrz have the same issue?
The link has expired. Does anyone have the pt file hosted somewhere else? https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/init.py#L23
Interested in this as well!
If any of you worked with the tdrz model last year, could you please check your ~/.cache/whisper for small.en-tdrz.pt? The ggml version is on Hugging Face, but the original .pt file seems to be lost to time.
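A quick way to run that check is sketched below. It assumes the checkpoint was downloaded to openai-whisper's default cache location ($XDG_CACHE_HOME/whisper, falling back to ~/.cache/whisper) rather than to a custom download directory. The helper names `whisper_cache_dir` and `find_checkpoint` are made up for this example.

```python
import os
from pathlib import Path


def whisper_cache_dir():
    """Return the default cache directory: $XDG_CACHE_HOME/whisper or ~/.cache/whisper."""
    base = os.getenv("XDG_CACHE_HOME", os.path.join(os.path.expanduser("~"), ".cache"))
    return Path(base) / "whisper"


def find_checkpoint(filename="small.en-tdrz.pt", cache_dir=None):
    """Return the path to `filename` inside the cache directory, or None if absent."""
    cache_dir = Path(cache_dir) if cache_dir is not None else whisper_cache_dir()
    candidate = cache_dir / filename
    return candidate if candidate.is_file() else None


found = find_checkpoint()
if found is None:
    print(f"small.en-tdrz.pt not found in {whisper_cache_dir()}")
else:
    print(f"found {found} ({found.stat().st_size} bytes)")
```

If the file was originally downloaded with a custom download root, pass that directory as `cache_dir` instead.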
I tried to load the ggml model, but I don't understand how to load it: https://huggingface.co/akashmjn/tinydiarize-whisper.cpp/tree/main