Perceiver_VL

Query about combining modality indicator

Open ShahRutav opened this issue 2 years ago • 0 comments

Hi,

Thanks for making your work open source! From the paper, I understand that all four embeddings (modality, temporal, positional, and patch/token) are added together [Section 3.1]. However, in the codebase, as far as I can tell, the modality indicator is concatenated with the sum of the other three [Here]. I might be missing something basic, so please let me know!
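To make the question concrete, here is a minimal sketch of the two variants as I understand them, for a single token. All names, sizes, and values are hypothetical and not taken from the codebase, and I am assuming the concatenation happens along the feature dimension, which may not match the actual implementation.

```python
# Toy illustration only: every name, size, and value here is hypothetical,
# not taken from the Perceiver_VL codebase.

patch_emb = [0.1, 0.2, 0.3, 0.4]   # patch/token embedding, width 4
pos_emb = [1.0, 1.0, 1.0, 1.0]     # positional embedding, width 4
temp_emb = [0.5, 0.5, 0.5, 0.5]    # temporal embedding, width 4
mod_emb = [2.0, 2.0, 2.0, 2.0]     # modality embedding, same width so it can be summed
mod_indicator = [1.0, 0.0]         # modality indicator for concatenation (width assumed)


def sum_all_four(patch, pos, temp, mod):
    """My reading of the paper (Section 3.1): element-wise sum of all four."""
    return [a + b + c + d for a, b, c, d in zip(patch, pos, temp, mod)]


def concat_modality(patch, pos, temp, indicator):
    """My reading of the code: sum three embeddings, then concatenate the
    modality indicator along the feature axis (the axis is an assumption)."""
    summed = [a + b + c for a, b, c in zip(patch, pos, temp)]
    return summed + indicator  # list concatenation, not addition


added = sum_all_four(patch_emb, pos_emb, temp_emb, mod_emb)
concatenated = concat_modality(patch_emb, pos_emb, temp_emb, mod_indicator)

print(len(added), len(concatenated))  # 4 6
```

In the first variant the token width is unchanged, while in the second it grows by the width of the modality indicator, so the two are not interchangeable for the layers that follow.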

Thanks and Regards, Rutav.

ShahRutav, Feb 09 '23 18:02