Jingcheng Hu

Results: 7 comments by Jingcheng Hu

Struggle-lsl has done a great job scraping the data. However, I just found that https://github.com/shiqiangw/iclr2024-scores achieves the same goal with the official OpenReview Python API, which makes it...

I am seeing a similar issue, though not exactly the same one. After resuming, the loss is slightly higher than it was before the resume, which is really strange. If I rm the...

I am also curious about this! Has anyone reached any conclusions on it?

Yes! I am also wondering when we will have such a tutorial! It would be of great use.

Looking forward to this, too!

Maybe downgrading to an older version of flash-attn is sufficient. I succeeded with flash-attn==1.0.4.

> @puyuanOT OK, I got the solution. Try disabling the hybrid engine; it makes the model always repeat 'a a a a a', and I am not sure why. I also meet...