llll comments

Results: 5 comments of llll

visualization

Hello, I would like to know at which inference step the attention weights should be extracted, and from which stage the attention weights should be taken when generating each word. Thanks.

visualization

I tried taking the attention weights from the last head of the last decoder layer's cross-attention; maybe you can try that. @not-hermione
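
A minimal sketch of that idea in plain Python, assuming the cross-attention weights have already been recorded at every decoding step as nested lists indexed `[layer][head][query][region]`. That layout and the names `last_head_attention` and `top_region` are hypothetical and not taken from any particular repository. Reading the newest query row at each step is also an assumption; it fits a decoder that re-runs the whole prefix at each step.

```python
from typing import List

# Hypothetical layout for one decoding step: [layer][head][query][region].
# One such entry is recorded per generated word.
Step = List[List[List[List[float]]]]


def last_head_attention(steps: List[Step]) -> List[List[float]]:
    """Return, for each generated word, the attention over image regions
    from the last head of the last decoder layer's cross-attention."""
    per_word = []
    for step in steps:
        last_layer = step[-1]        # last decoder layer
        last_head = last_layer[-1]   # last head of that layer
        # Assumption: the newest query position is the one predicting this word.
        per_word.append(last_head[-1])
    return per_word


def top_region(weights: List[float]) -> int:
    """Index of the image region with the largest attention weight."""
    return max(range(len(weights)), key=weights.__getitem__)


if __name__ == "__main__":
    # Toy data: 2 decoding steps, 2 layers, 2 heads, 3 image regions.
    toy_steps = [
        [  # step 0: one query position
            [[[0.2, 0.3, 0.5]], [[0.1, 0.1, 0.8]]],
            [[[0.6, 0.3, 0.1]], [[0.7, 0.2, 0.1]]],
        ],
        [  # step 1: two query positions
            [[[0.2, 0.3, 0.5], [0.3, 0.3, 0.4]], [[0.1, 0.1, 0.8], [0.5, 0.4, 0.1]]],
            [[[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]], [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]],
        ],
    ]
    for word_index, weights in enumerate(last_head_attention(toy_steps)):
        print(word_index, weights, top_region(weights))
```

The per-word weight vectors can then be reshaped to the feature-map grid and overlaid on the image. Averaging over heads instead of taking only the last one is another option worth comparing.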

question about CUDA memory for SCST

Thank you very much! I use Ruotian Luo's code, [ImageCaptioning.pytorch](https://github.com/ruotianluo/ImageCaptioning.pytorch), with Swin Transformer features instead of bottom-up features for training, and it runs with about 9 GB of memory for SCST. But I...

question about CUDA memory for SCST

Thanks a lot! I will try your advice for training. Thank you very much again for your patience!

question about CUDA memory for SCST

Dear Author! Sorry to bother you! I have tried your suggestion and used Swin Transformer to extract image features, but it scored 2-3 CIDEr points lower than using the image just in...