UniVL

How to only input text feature or video feature

Open tingchihc opened this issue 2 years ago • 2 comments

I want to input only the text feature or only the video feature into UniVL. The paper says that one transformer combines the text representation T and the video representation V. Could you tell me how to change it so that only T or only V is fed into UniVL? Thanks.

tingchihc, Aug 03 '22 22:08

Hi @ting-chih, sorry for the delayed reply. The model still needs both T and V as inputs, but either one can be masked out if you only want to feed in the other. For example, for video-only input, T is just [CLS][SEP]; for text-only input, V is all zeros. Best~

ArrowLuo, Aug 08 '22 05:08
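
For anyone landing here later, the reply above can be sketched roughly as follows. This is only an illustration in plain Python lists, not code from the UniVL repo: the sizes, the helper names `text_placeholder` and `video_placeholder`, and the BERT-style token ids 101/102/0 are all assumptions, and setting the video mask to all zeros is my reading of "can be masked" rather than something the reply states. In practice you would convert these lists to tensors with the shapes your data loader already uses.

```python
from typing import List, Tuple

# Hypothetical sizes and token ids, chosen only for illustration.
MAX_WORDS = 8     # padded text length
MAX_FRAMES = 4    # padded video length
VIDEO_DIM = 3     # per-frame feature size (real features are much larger)
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # BERT-style ids (assumed)


def text_placeholder() -> Tuple[List[int], List[int]]:
    """Video-only input: T is just [CLS][SEP], everything else is padding."""
    input_ids = [CLS_ID, SEP_ID] + [PAD_ID] * (MAX_WORDS - 2)
    # Only the two special tokens are marked as valid positions.
    input_mask = [1, 1] + [0] * (MAX_WORDS - 2)
    return input_ids, input_mask


def video_placeholder() -> Tuple[List[List[float]], List[int]]:
    """Text-only input: V is all zeros (mask of all zeros is an assumption)."""
    video = [[0.0] * VIDEO_DIM for _ in range(MAX_FRAMES)]
    video_mask = [0] * MAX_FRAMES
    return video, video_mask


if __name__ == "__main__":
    ids, mask = text_placeholder()
    frames, frame_mask = video_placeholder()
    print(ids, mask)
    print(frames, frame_mask)
```

For video-only use, pass the real video features together with the output of `text_placeholder()`; for text-only use, pass the real token ids together with the output of `video_placeholder()`.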

Hi! Do you know how to download the raw videos of YouCook2? Thank you very much!

tiesanguaixia, May 20 '23 15:05