ONE-PEACE
Vision data in datasets
Hi,
I would like to replicate some of the results, so I started exploring the datasets. It's very nice that you have shared them, but I was surprised to find only the audio data in some of them, for example AVQA, VGGSound and Kinetics400. Am I missing something there?