Pingzhi Li

Results: 4 comments by Pingzhi Li

Thank you for your interest! This is an important suggestion - to evaluate the final performance of MC-SMoE on Mixtral. We haven't tried it yet, as distillation on large models...

Hi, thank you for your interest. I think this is due to a version issue. Could you try downgrading `transformers` to 4.34.1?
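For anyone checking whether their environment matches the suggested version, here is a minimal sketch. It assumes the package was installed under the distribution name `transformers` (the usual downgrade would be `pip install transformers==4.34.1`); the helper names `installed_version` and `is_pinned` are made up for illustration and are not part of any library.

```python
from importlib import metadata
from typing import Optional

# Version suggested in the comment above.
EXPECTED_VERSION = "4.34.1"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_pinned(installed: Optional[str], expected: str = EXPECTED_VERSION) -> bool:
    """True only when the installed version string equals the expected one."""
    return installed == expected


if __name__ == "__main__":
    found = installed_version("transformers")
    if is_pinned(found):
        print(f"transformers {found} matches the suggested version")
    else:
        print(f"transformers is {found}; suggested version is {EXPECTED_VERSION}")
```

The comparison is a plain string match on purpose: the suggestion is an exact pin, not a version range.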

Solved by adding `NCCL_P2P_DISABLE=1`, but I am still confused and worried about the performance. I would greatly appreciate it if someone could help with this. :)
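A minimal sketch of applying the same workaround from inside a Python launcher rather than on the command line. `NCCL_P2P_DISABLE` is a documented NCCL environment variable that turns off the direct GPU-to-GPU (P2P) transport; that is likely why it can cost some communication speed, though how much depends on the hardware. The function name `disable_nccl_p2p` is hypothetical, and the sketch assumes the variable is set before any NCCL communicator is created.

```python
import os


def disable_nccl_p2p() -> None:
    """Set NCCL_P2P_DISABLE=1 for this process and any child processes it spawns."""
    os.environ["NCCL_P2P_DISABLE"] = "1"


# Call this before initializing distributed training, since NCCL reads
# its environment variables when a communicator is set up.
disable_nccl_p2p()

# Any framework import or process-group initialization would follow here.
```

Setting the variable in the shell (as in the comment above) has the same effect; doing it in code just keeps the workaround next to the training entry point.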

Hi, thanks for your question. During both merging and compression, the scripts run evaluation every N steps by default. You should already have the evaluation results now.