ActivityNet-Entities
A Dataset for Grounded Video Description
I am extracting frames for video v_Mzojo2EeWu8, segment 3, frame_id 8, using the code from https://github.com/facebookresearch/ActivityNet-Entities/issues/1#issuecomment-529065386. But I only get 8 pictures (frame_id 0-7), while the labeled frame is the ninth picture. I guess...
Hi, are the training, validation and test sets indicated by some annotation, or do we have to split the data manually? Thank you
In Anet_captions, segment 3 is "We see the arena in blue and the sax player in red." In Anet_entities, segment 3, none phrases is "in red" and "the arena in blue", "the sax"...