ReferFormer
Vision Language Early fusion
I noticed that the paper does not mention the early-fusion module, yet in the code this module is applied before the Transformer. I suspect this simple module contributes substantially to the results. Could you explain its role?
Thank you!
https://github.com/wjn922/ReferFormer/blob/9c8f237adc260c512a1c5ecfc7aee81b8282649a/models/referformer.py#L141
https://github.com/wjn922/ReferFormer/blob/9c8f237adc260c512a1c5ecfc7aee81b8282649a/models/referformer.py#L243
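
To make the question concrete, here is a minimal sketch of the kind of early fusion I mean. It assumes the module is cross-attention from each visual feature to the word features, followed by an element-wise product; that is my reading, not something the paper confirms. It is plain Python with a single head and no learned projections, and `early_fusion`, `softmax`, and the input shapes are illustrative names of my own, not the repository's code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def early_fusion(visual, text):
    """Fuse text into visual features before any Transformer layer.

    visual: list of N visual feature vectors, each of length d
    text:   list of L word feature vectors, each of length d
    Returns N fused vectors of length d.

    Assumption: single-head attention without learned projections.
    """
    d = len(visual[0])
    fused = []
    for v in visual:
        # Scaled dot-product score of this visual feature against each word.
        scores = [sum(a * b for a, b in zip(v, t)) / math.sqrt(d) for t in text]
        weights = softmax(scores)
        # Attention-weighted text context for this visual location.
        context = [sum(w * t[i] for w, t in zip(weights, text)) for i in range(d)]
        # Element-wise product: the text context gates the visual feature.
        fused.append([a * c for a, c in zip(v, context)])
    return fused


if __name__ == "__main__":
    visual = [[1.0, 0.0], [0.0, 1.0]]
    text = [[1.0, 1.0], [0.5, 2.0]]
    print(early_fusion(visual, text))
```

If the real module works roughly like this, the visual features are already conditioned on the expression before the Transformer sees them, which is why I suspect it matters for the final numbers.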