mixtral-offloading

How to transform the original Mixtral 8x7B into the mixed HQQ-quantized model?

Open eljrte opened this issue 11 months ago • 0 comments

I really appreciate your great work, as it helps me run MoE models on a consumer GPU. I wonder how you transformed the original Mixtral 8x7B into the quantized one using HQQ. I found your model.safetensors.index.json very unusual: each part has its own safetensors file. Do you have a script, or can you briefly describe the process? I would appreciate it very much.
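The per-part layout mentioned above can be seen by grouping the entries of the index file by shard. This is only a minimal sketch, assuming the index follows the standard Hugging Face sharded-checkpoint layout (a top-level `weight_map` object mapping tensor names to shard filenames). The helper names and the sample tensor and file names are hypothetical and are not taken from the repository.

```python
import json
from collections import defaultdict


def group_tensors_by_shard(index):
    """Group tensor names by the shard file that stores them.

    Assumes `index` has a top-level "weight_map" dict of the form
    {tensor_name: shard_filename}.
    """
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


def load_index(path):
    """Read a model.safetensors.index.json file from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Made-up index for illustration only; real tensor and file names will differ.
    sample_index = {
        "weight_map": {
            "example.expert_0.weight": "part_a.safetensors",
            "example.expert_0.scale": "part_a.safetensors",
            "example.expert_1.weight": "part_b.safetensors",
        }
    }
    for shard, names in sorted(group_tensors_by_shard(sample_index).items()):
        print(f"{shard}: {len(names)} tensors")
```

For a real checkpoint, `group_tensors_by_shard(load_index("model.safetensors.index.json"))` would show which tensors live in which file, which may help in working out how the parts were split.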

eljrte, Feb 06 '25 12:02