CATER Actions per frame


Hi authors, thanks for your work. I tried generating 3 videos to test the dataset. While actions_order_dataset seems to return frame, label, and classes, the output file (train.txt) under the folder action_order_uniq contains none of that information.

Each line in it looks something like this: /images/CLEVR_new_000002.avi 53,54,60,69,70,71,72,74,77,78,81,83,129,138,144,153,155,156,157,161,162,165,167,173,179,187,188,195,197,198,200,203,204,207,209,257,263,264,265,270,272,279,281,282,284,287,288,291,292,293,381,382,383,387,389,390,392,396,398,405,407,408,410,411,412,413,414,415,417,419,423,425,430,431,432,434,438,440,447,449,450,452,455,456,459,460,461,465,471,474,480,489,490,491,492,495,497,498,501,502,509,515,518,524,532,533,536,539,545,549,551,555,557,558,560,564,565,566,573,575,576,577,578,580,581,582,585,586,587

How can I get frame-by-frame actions and classes?

Feb 09 '21 11:02 Rajawat23

Hi, thanks for your interest. The actions_order task is a multi-label classification task where we pre-define a set of action-order classes, and the list that you see contains the indices of the classes that are active at some point in the video.
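
For readers who want to turn such a line into a training target, here is a minimal sketch, not the repository's own loader. It assumes each line is a video path, a space, and a comma-separated list of class indices, as in the example above, and that there are 301 classes (the vector length mentioned later in this thread). The function name `parse_label_line` is hypothetical.

```python
NUM_CLASSES = 301  # assumption: vector length discussed in this thread


def parse_label_line(line, num_classes=NUM_CLASSES):
    """Split one 'path idx,idx,...' line into (path, multi-hot list)."""
    # Everything before the first space is the video path,
    # everything after it is the comma-separated index list.
    path, _, idx_field = line.strip().partition(" ")
    active = [int(tok) for tok in idx_field.split(",") if tok]

    # Multi-hot target: 1 where the class is active somewhere in the video.
    multi_hot = [0] * num_classes
    for idx in active:
        multi_hot[idx] = 1
    return path, multi_hot


# Shortened version of the line quoted in the question.
path, target = parse_label_line("/images/CLEVR_new_000002.avi 53,54,60")
print(path, sum(target))
```

The result carries no frame information: it only records which classes occur anywhere in the clip.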

To get actions active at any given frame, you should be able to use the movements metadata, like this.
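
As a rough sketch of that idea (not the authors' code), the lookup below assumes the movements metadata maps each object name to a list of `[action, other_object, start_frame, end_frame]` entries. That layout, the function name `actions_at_frame`, and the sample object and action names are all assumptions for illustration, so check them against the scene files you generated.

```python
def actions_at_frame(movements, frame):
    """Return (object, action) pairs whose frame interval covers `frame`.

    `movements` is assumed to look like:
        {object_name: [[action, other_object, start_frame, end_frame], ...]}
    """
    active = []
    for obj_name, moves in movements.items():
        for action, _other, start, end in moves:
            # Treat the interval as inclusive on both ends (an assumption).
            if start <= frame <= end:
                active.append((obj_name, action))
    return active


# Hypothetical metadata for two objects; real files may differ.
example_movements = {
    "Cone_0": [["_slide", None, 10, 40], ["_pick_place", None, 60, 90]],
    "Sphere_1": [["_rotate", None, 30, 55]],
}

print(actions_at_frame(example_movements, 35))
```

Calling this once per frame index would give a frame-by-frame list of active actions, which is what the train.txt labels do not provide.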

Feb 16 '21 17:02 rohitgirdhar

Am I right in assuming that the whole 10 s video is the input and the whole list of classes is the label? That is, the output is a 301-length vector describing whether each class was present at any time in the 10 s video.

Feb 02 '23 19:02 Ramtin-Nouri

Yes, that is correct.

Feb 03 '23 21:02 rohitgirdhar
