
About training and testing dataset

Open • kaiw7 opened this issue 1 year ago • 1 comment

Hi Pritam, thank you very much for your amazing work. I have some questions about the datasets you used. For the pretraining datasets (K400, AudioSet, and Kinetics-Sound), do you always use both audio and visual information, and does every video contain an audio stream? I am trying K400 and found that some videos are missing the audio stream. Also, for the downstream datasets such as UCF-101 and HMDB-51, do you use audio-visual pairs, or only the visual information for evaluation? It seems that the video files in UCF-101 do not always contain an audio stream. Thank you very much.
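
For context, files without an audio stream can be detected roughly as follows. This is only a minimal sketch, not the method used in the paper: it assumes `ffprobe` (from FFmpeg) is installed and on the PATH, and the function names, the `*.mp4` pattern, and the directory layout are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def has_audio_stream(probe: dict) -> bool:
    """Return True if parsed ffprobe JSON lists at least one audio stream."""
    return any(s.get("codec_type") == "audio" for s in probe.get("streams", []))


def probe_streams(path: Path) -> dict:
    """Run ffprobe (assumed installed) and return its JSON output as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "stream=codec_type",
         "-of", "json", str(path)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def find_silent_videos(root: Path, pattern: str = "*.mp4") -> list[Path]:
    """List videos under root (hypothetical layout) that have no audio stream."""
    return [p for p in sorted(root.rglob(pattern))
            if not has_audio_stream(probe_streams(p))]
```

Calling `find_silent_videos(Path("k400/train"))` (a placeholder path) would return the clips that need to be skipped or handled as video-only.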

kaiw7 • May 30 '23 15:05

Also, could you please share the download link for the Kinetics-Sound dataset you used in this work?

kaiw7 • May 30 '23 15:05