Arun Reddy
The ClipBERT paper (Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling), which appears under the Information Retrieval section, was released in the pre-CLIP era and is in fact...
I see [here](https://github.com/xuguohai/X-CLIP/blob/main/modules/modeling_xclip.py#L130) that X-CLIP uses learnable weight matrices to compute the final scores from the similarity vectors/matrices. However, I am having trouble reconciling this with the equations in Section...
I would like to add a paper to this list that was published in CVPR 2024. It is an unsupervised domain adaptation method for video action recognition which uses self-supervised...