colian
I see. Thanks for your fast reply. Could you share some materials that introduce this "sp" scheme for full attention? I don't quite understand why it separates frames into different...
Thank you very much! After reading the DEEPSPEED ULYSSES paper, I understand it now.
Thanks for your reply. By the way, have you looked into the reasons for this? For example, is it caused by some samples having a large bias?
Looking forward to the release of the code for Navit.
@LinB203 Sorry, I did not see your fast reply! The monitored memory usage seems normal; maybe wandb did not capture the broken steps. I don't have much data (less than...