MoE-Infinity
Evaluating Mixtral-8x7B-Instruct-v0.1-offloading-demo on MMLU
Hi! I'm currently running MoE-Infinity with Mixtral-8x7B-Instruct-v0.1-offloading-demo (the quantized version) on MMLU. I encountered a failure when loading the model weights, and I'd like to know whether MoE-Infinity is compatible with the quantized version of the Mixtral model. Thanks!
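For context, this is roughly how I load the model. This is a minimal sketch assuming the `MoE` wrapper and config keys shown in the MoE-Infinity README; the checkpoint id and offload path are placeholders for my local setup:

```python
import os
from transformers import AutoTokenizer
from moe_infinity import MoE

# Placeholder: the quantized demo checkpoint from the Hugging Face Hub.
checkpoint = "Mixtral-8x7B-Instruct-v0.1-offloading-demo"

config = {
    "offload_path": os.path.join(os.path.expanduser("~"), "moe-infinity"),  # placeholder path
    "device_memory_ratio": 0.75,
}

# Weight loading fails at this step for the quantized checkpoint.
model = MoE(checkpoint, config)

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
input_ids = tokenizer("Sample MMLU question ...", return_tensors="pt").input_ids.to("cuda:0")
output_ids = model.generate(input_ids)
```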
Do you have a detailed log or console output?
The output showed that, for layers 0–26, only q_proj was loaded correctly; all other parameters failed to load. For layers 27–31, no parameters were loaded at all.
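One way I could narrow this down is to list the tensor names actually stored in the checkpoint shards and compare them against the standard Mixtral parameter names: if the quantized checkpoint stores weights under non-standard names (e.g. quantization metadata instead of plain weight tensors), a loader expecting the standard names would fail for exactly those parameters. Below is a minimal sketch using `safetensors`; the shard filename is a placeholder:

```python
from safetensors import safe_open

# Placeholder: point this at one of the downloaded checkpoint shards.
shard = "model-00001-of-00002.safetensors"

with safe_open(shard, framework="pt") as f:
    for name in f.keys():
        # Compare these against the names MoE-Infinity expects,
        # e.g. model.layers.0.self_attn.q_proj.weight
        print(name)
```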