lorax
Support loading `.pt` weights
Feature request
LoRAX should support loading models whose weights are only available as .pt files.
Motivation
I quantized the Mixtral 8x7B model using HQQ, which produces a qmodel.pt file. However, I am unable to load the weights in LoRAX because it expects either .safetensors or .bin weights.
Your contribution
I haven't studied the source enough to submit a PR, but from a cursory reading of the code, the changes need to be made in the hub.py file, specifically: https://github.com/predibase/lorax/blob/cc2e0a90380c1342ea39cc483f3db8230cbf8d05/server/lorax_server/utils/sources/hub.py#L68-L78
I would also like to be able to load the base model from a local path rather than from the hub (as explained in this issue: https://github.com/predibase/lorax/issues/347).
I will work on a fix for this alongside #347
Looks like we just need to support the .pt extension as an alternative to .bin (it should be the same underlying format).
As a workaround @shripadk can you try renaming the file to qmodel.bin?
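For illustration, the extension change suggested above could look roughly like the sketch below. This is not the actual code in hub.py; `weight_filenames` and `PICKLE_EXTENSIONS` are hypothetical names, and the sketch assumes the existing logic filters a list of repo filenames by a requested extension.

```python
from pathlib import PurePosixPath

# Hypothetical constant: extensions assumed to share the PyTorch pickle format.
PICKLE_EXTENSIONS = (".bin", ".pt")


def weight_filenames(filenames, extension):
    """Return top-level files matching `extension`.

    Requesting ".bin" (or ".pt") matches both, on the assumption that
    the two extensions hold the same underlying format.
    """
    wanted = PICKLE_EXTENSIONS if extension in PICKLE_EXTENSIONS else (extension,)
    return [
        name
        for name in filenames
        # Skip files in subdirectories; keep only top-level weight files.
        if "/" not in name and PurePosixPath(name).suffix in wanted
    ]
```

With something like this, a repo containing only qmodel.pt would be picked up when .bin weights are requested, so the rename workaround would no longer be needed.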