LGI4temporalgrounding

"video_masks" and "grounding_att_masks"

Open TAY-985 opened this issue 3 years ago • 2 comments

Hello, may I ask a question? What is the difference between "video_masks" and "grounding_att_masks"? I understand "grounding_att_masks", but I do not understand "video_masks". How is it used?

TAY-985, Nov 10 '21 07:11

The video mask indicates the length of each video and guides the network to compute attention weights only on unmasked positions. This is required for batch-level training, where videos of different lengths are padded to a common length. For example, when we have two videos of different lengths, the first with 3 clip features and the second with 5, the mask is shaped as follows:

[ [ 1 1 1 0 0]
  [ 1 1 1 1 1] ]
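
To make this concrete, here is a minimal sketch in plain Python of how such a mask could be built from the video lengths and used to restrict attention to real clips. It is an illustration only, not the code from this repository: the helper names `build_video_masks` and `masked_softmax` and the example scores are hypothetical, and it assumes every video has at least one clip.

```python
import math

def build_video_masks(lengths):
    """Hypothetical helper: 0/1 masks padded to the longest video in the batch."""
    max_len = max(lengths)
    return [[1 if i < n else 0 for i in range(max_len)] for n in lengths]

def masked_softmax(scores, mask):
    """Hypothetical helper: softmax over positions where mask == 1.

    Padded positions (mask == 0) receive an attention weight of exactly 0.
    Assumes at least one position is unmasked.
    """
    valid = [s for s, m in zip(scores, mask) if m == 1]
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m == 1 else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Two videos with 3 and 5 clip features, as in the example above.
video_masks = build_video_masks([3, 5])

# Made-up attention scores for the first video; the last two entries are padding.
scores = [0.5, 1.0, 0.2, 9.0, 9.0]
weights = masked_softmax(scores, video_masks[0])
```

Even though the padded positions carry large scores here, they contribute nothing: the weights sum to 1 over the first three clips only.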

JonghwanMun, Nov 10 '21 08:11

OK, I see. Thank you.

TAY-985, Nov 10 '21 09:11