openai-whisper-cpu
Can I save the quantized model to disk to avoid calling `torch.quantization.quantize_dynamic` each time?
I've managed to run custom_whisper.py, but I'm wondering if we can save the quantized model to disk and let the whisper CLI (with the modified nn.Linear part) use it like the other official models.
Actually, this part already exists in the code (here). It is:
torch.save(model.state_dict(), path)
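For context, a minimal sketch of that state_dict approach, using a toy model in place of the actual Whisper model (the layer sizes and file name here are made up for illustration):

```python
import torch
import torch.nn as nn

# Toy stand-in for the Whisper model (NOT the real architecture).
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))

# Dynamically quantize the Linear layers to int8, as custom_whisper.py does.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Save only the parameters, as in the snippet above.
torch.save(quantized.state_dict(), "quantized_state.pt")
# Note: loading this state_dict later requires first rebuilding the
# *quantized* module structure -- a plain float model will not accept it.
```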
IMPORTANT!
What I have noticed is that you won't be able to load a Whisper model from this file, because loading it requires some extra params. To save and then load the model correctly, just use torch.save(model, path) and then torch.load(path) (omitting the state_dict() call).
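A quick sketch of that whole-module round trip, again with a toy model standing in for Whisper. One assumption worth flagging: recent PyTorch versions default torch.load to weights_only=True, so unpickling a full module needs weights_only=False:

```python
import torch
import torch.nn as nn

# Toy stand-in for the Whisper model (NOT the real architecture).
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Save the whole module, not just the state_dict.
torch.save(quantized, "quantized_model.pt")

# Load it back without re-quantizing. weights_only=False is needed on
# newer PyTorch releases to unpickle an arbitrary nn.Module.
restored = torch.load("quantized_model.pt", weights_only=False)

x = torch.randn(1, 8)
assert torch.allclose(quantized(x), restored(x))
```

The trade-off is that pickling the whole module ties the file to the defining code (including the modified nn.Linear class), so the class must be importable when you load it.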