Zizheng Pan

Results: 7 comments by Zizheng Pan

Any updates on this issue? I'd also like to know how you applied Performer to your architecture, as shown in Tables 5 and 6.

Hi @jingquanliang, I didn't see your result. Have you fixed it? If not, you can try this script: name=agent_bt flag="--attn soft --train validlistener --load snap/agent_bt/state_dict/best_val_unseen --angleFeatSize 128 --submit --featdropout 0.4...

Thanks for the help @czczup, I just tried this new code and unfortunately it still doesn't work. My setting is based on 8 GPUs and I can run other scales...

Hi @tanbuzheng, thanks for using HiLo. Can you give more details or a brief script of your testing? e.g., tensor shape of your feature map, concrete setting of HiLo including...

Thanks for the additional information. If we only look at the Attention layer itself, theoretically HiLo attention should be faster than Local Window Attention, ```python from hilo import HiLo W...

Hi @q958287831, thank you for your interest! A video clip can be thought of as a list of frames. Initially, for HiLo, we have an input tensor in the shape...