faster-whisper
Integrate HF PEFT adapter models in CTranslate2 or faster-whisper
@guillaumekln I am not sure whether this is more apt for CTranslate2, but since HF parameter-efficient fine-tuning (PEFT) is also done on Whisper for specific languages, I am posting it here.
HF Whisper models can be converted to the CTranslate2 format. Is there a way to do the same with PEFT-adapted models? They are in a similar format, and it should be possible to combine them optionally at inference time. Alternatively, it would be fine to first merge them into a single CTranslate2 model and disable the adapter at inference time to recover the baseline Whisper performance.
This would be really useful for those who want to use Whisper plus performance improvements for specific languages.
It is fine to alternatively first convert it into a single model
I’m not a PEFT expert but it seems to me that you can already do this before running the conversion to CTranslate2.
See for example https://github.com/huggingface/peft/issues/308
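A minimal sketch of that workflow, assuming a LoRA-style adapter: load the base Whisper model, apply the adapter with `peft`, merge the adapter weights into the base weights, save the merged checkpoint, then run the CTranslate2 converter on it. The model ID and directory names below are placeholders.

```python
def merge_adapter(base_id: str, adapter_dir: str, merged_dir: str) -> None:
    """Fold a PEFT adapter into the base Whisper weights and save the result."""
    # Imports are deferred so the lightweight helper below can be used
    # without transformers/peft installed.
    from transformers import WhisperForConditionalGeneration
    from peft import PeftModel

    base = WhisperForConditionalGeneration.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()  # merges LoRA deltas into the base weights
    merged.save_pretrained(merged_dir)


def converter_command(merged_dir: str, out_dir: str) -> list:
    """Build the CTranslate2 converter invocation for the merged checkpoint."""
    return [
        "ct2-transformers-converter",
        "--model", merged_dir,
        "--output_dir", out_dir,
        "--quantization", "float16",
    ]


# Usage (not run here; downloads model weights):
# merge_adapter("openai/whisper-small", "my-lora-adapter", "whisper-merged")
# then run: ct2-transformers-converter --model whisper-merged --output_dir whisper-merged-ct2 --quantization float16
```

The key step is `merge_and_unload()`, which bakes the adapter into the base weights so the converter sees an ordinary Whisper checkpoint.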
@guillaumekln That is great! After converting to CTranslate2, how should one disable the adapters and work with the baseline model at inference? In PEFT models, you can do this with model.disable_adapter():
@guillaumekln Is there a way to disable the fine-tuned weight matrices (or make them identity) in the final converted model at run time in CTranslate2? It is definitely an interesting way forward.
Currently the adapters must be merged into the base model before converting to CTranslate2, so disabling the adapters here simply means using the base model directly.
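In practice that means keeping two converted model directories and choosing one at load time, since a merged CTranslate2 model cannot toggle its adapter off. A hedged sketch with faster-whisper; the directory names are placeholders:

```python
def model_dir(use_adapter: bool) -> str:
    """Pick the converted model directory: merged (adapter baked in) or plain base."""
    return "whisper-merged-ct2" if use_adapter else "whisper-base-ct2"


def load_model(use_adapter: bool):
    """Load the chosen CTranslate2 model with faster-whisper."""
    from faster_whisper import WhisperModel  # deferred heavy import

    return WhisperModel(model_dir(use_adapter))


# Usage (not run here; requires the converted model directories):
# model = load_model(use_adapter=True)
# segments, info = model.transcribe("audio.wav")
```

Selecting between two directories is the only "disable adapter" mechanism available today; the linked CTranslate2 issue tracks runtime adapter support.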
To support the adapters we need to make changes in CTranslate2. I'm closing this issue in favor of https://github.com/OpenNMT/CTranslate2/issues/1186