GroupViT
Mistakes in the GCC Dataset download commands
Hi! I noticed that the commands for downloading GCC 3M and 12M have a couple of typos. The corrected version for GCC 12M is below:
sed -i '1s/^/url\tcaption\n/' gcc12m.tsv
img2dataset --url_list gcc12m.tsv --input_format "tsv" \
--url_col "url" --caption_col "caption" --output_format webdataset \
--output_folder local_data/gcc12m_shards \
--processes_count 16 --thread_count 64 \
--image_size 512 --resize_mode keep_ratio --resize_only_if_bigger True \
--enable_wandb True --save_metadata False --oom_shard_count 6
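One note in case it helps anyone re-running these steps: the `sed` line prepends the header every time it is run, so running it twice leaves two header rows in the file. Below is a minimal sketch of a guard against that, assuming a POSIX shell. `add_header` is a hypothetical helper name, not something from the repo or from img2dataset.

```shell
# Hypothetical helper (not part of GroupViT or img2dataset):
# prepend the "url<TAB>caption" header only if the first line is not already that header.
add_header() {
  file="$1"
  header="$(printf 'url\tcaption')"
  if [ "$(head -n 1 "$file")" != "$header" ]; then
    tmp="$(mktemp)"
    # Write the header, then the original rows, then replace the file.
    { printf '%s\n' "$header"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
  fi
}
```

With this, `add_header gcc12m.tsv` could stand in for the `sed -i` line above, and it also avoids relying on GNU-specific `sed -i` and `\t` handling.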
It would be nice if you could update the README.md with the corrected commands. Great work! Thanks for sharing it.
Thx! Will do!