TemporalAlignNet

Why is my count for the number of videos in htm_aa_v1.csv 247,564 instead of 370K?

Open fake-warrior8 opened this issue 2 years ago • 4 comments

Hi, I downloaded htm_aa_v1.csv from the Oxford server you provided. I used np.unique to count the videos in the list and found only 247,564 videos, not 370K. Is something wrong?
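For reference, the counting step described above can be sketched roughly as follows. This is a minimal sketch using only the standard library (a set gives the same distinct count as np.unique); the column name `vid` and the sample rows are hypothetical, since the actual header of htm_aa_v1.csv is not shown in this thread.

```python
import csv
import io

def count_unique_video_ids(csv_file, id_column="vid"):
    """Count distinct values in `id_column` of an open CSV file with a header row.

    `id_column` is a hypothetical name; replace it with the real header.
    """
    reader = csv.DictReader(csv_file)
    return len({row[id_column] for row in reader})

# Tiny in-memory stand-in for the real file (made-up rows, assumed columns).
sample = io.StringIO(
    "vid,start,end,text\n"
    "abc123,0.0,2.5,first sentence\n"
    "abc123,2.5,5.0,second sentence\n"
    "xyz789,1.0,3.0,another video\n"
)
print(count_unique_video_ids(sample))  # 2
```

With the real file, one would pass `open("htm_aa_v1.csv", newline="")` instead of the in-memory sample.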

fake-warrior8 avatar Jul 08 '22 02:07 fake-warrior8

htm_aa_v1.csv should be the output of TAN, which will be used for downstream representation learning. It is not the HTM370k training set for TAN.

yuanzheng625 avatar Jul 08 '22 17:07 yuanzheng625

> htm_aa_v1.csv should be the output of TAN, which will be used for downstream representation learning. It is not the HTM370k training set for TAN.

Will you release a list of video IDs for HTM370K?

fake-warrior8 avatar Jul 09 '22 01:07 fake-warrior8

I am waiting for the author to release their training code and training set as well.

yuanzheng625 avatar Jul 09 '22 01:07 yuanzheng625

Hi, HTM-370K is ready to download from the project page. We provide two files:

  1. YouTube ASR sentences: [sentences, start-timestamps, end-timestamps] for each video,
  2. The video IDs in a txt file.
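As a rough sketch of how the video ID file might be checked against another list, the snippet below reads IDs from a text file and compares two sets. It assumes one video ID per line, which is a guess about the file layout and not something confirmed here; the function names and the sample IDs are made up for illustration.

```python
import io

def read_video_ids(txt_file):
    """Return the set of non-empty, whitespace-stripped lines from an open text file.

    Assumes one video ID per line (an assumption about the released file).
    """
    return {line.strip() for line in txt_file if line.strip()}

def compare_id_sets(reference_ids, other_ids):
    """Return (shared, only_in_reference) as two sets."""
    return reference_ids & other_ids, reference_ids - other_ids

# In-memory stand-ins for the real files (made-up IDs).
htm_ids = read_video_ids(io.StringIO("abc123\nxyz789\n\nabc123\nqrs456\n"))
aligned_ids = {"abc123", "qrs456"}

shared, missing = compare_id_sets(htm_ids, aligned_ids)
print(len(htm_ids), len(shared), sorted(missing))  # 3 2 ['xyz789']
```

A comparison like this would show how many of the HTM-370K videos also appear in a derived file such as htm_aa_v1.csv.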

We recently found that YouTube ASR quality improves over time: the ASR file released by the HowTo100M team 3 years ago (https://www.di.ens.fr/willow/research/howto100m/) contains many low-quality ASR texts. So we re-downloaded the YouTube ASR files in June and ran them through the pre-processing pipeline, in order to release ASR files of better quality.

I'm working on the rest. Thank you for your patience.

TengdaHan avatar Aug 04 '22 00:08 TengdaHan