Jingcheng Hu
Struggle-lsl did a great job scraping the data. However, I just found that in https://github.com/shiqiangw/iclr2024-scores, the same goal can be achieved with the official openreview-python API, which makes it...
I am seeing a similar issue, though not exactly the same one: the loss is slightly higher than it was before the resume, which is really strange. If I rm the...
I am also curious about this! Does anyone have any conclusions on this?
Yes! I am also wondering when we will have such a tutorial! It will be of great use.
Looking forward to this, too!
Maybe downgrading to an older version of flash-attn is sufficient. It worked for me with flash-attn==1.0.4.
> @puyuanOT OK, I got the solution. Try disabling the hybrid engine; it makes the model always repeat 'a a a a a', and I'm not sure why. I also meet...