Weihang Wang
why is the problem still not solved? :(
Hello, I have received your email.
> Hey! Not a paper author here, but I'm currently working on reproducing the results of the OpenMoE paper, specifically on token routing. Take a look: https://github.com/Misterion777/moe-experiments/blob/main/notebooks/routing_eda.ipynb Would appreciate any collaboration!...
Hello, I have received your email.
Why have you added warnings only for the initialization process and not for renaming during loading as well? The model I'm using is timm's convnext (which is even the companion...
Same :( It seems this still has not been solved.
> One more thing is that the model you are using is not quantized to FP8. It is FP16. Hello, thank you for your reply. My launch command follows the...
> One more thing is that the model you are using is not quantized to FP8. It is FP16. I'm curious about this. According to the calculations on the website...
> > Have you added special tokens to your tokenizer without resizing lm_embedding? That leads to a mismatch between the label classes and lm_head. It seems that they are...