Cannot download from subset of LAION-5B

Open dogoulis opened this issue 2 years ago • 2 comments

I have created a .txt file with the URLs taken from the cheesecake class on the LAION-5B database. When I run the command img2dataset --url_list=subset.txt --output_folder=out, it produces the following error:

CSV parse error: Expected 1 columns, got 2: https://img1.baidu.com/it/u=1798831392,1750834138&fm=15&fmt=auto&gp=0.jpg

dogoulis, Jul 28 '22 07:07

Hi, you're hitting https://github.com/rom1504/img2dataset/issues/196. In the meantime, I advise you to use a better format than txt. For example, the LAION-5B search demo provides JSON files if you click the download button.
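
If you only have the .txt file, a minimal sketch of converting it to JSON yourself could look like the following. The URL in the error message contains a comma, which is presumably what trips the txt reader, and JSON avoids that. The function name txt_urls_to_json and the file names are hypothetical, and the "url" key is an assumption chosen to match img2dataset's default url_col.

```python
import json
from pathlib import Path


def txt_urls_to_json(txt_path, json_path):
    """Convert a one-URL-per-line text file into a JSON list of records.

    Hypothetical helper, not part of img2dataset. Each record has a single
    "url" key (an assumption matching the default url_col).
    Returns the number of URLs written.
    """
    lines = Path(txt_path).read_text(encoding="utf-8").splitlines()
    # Keep each line whole, so commas inside a URL are preserved.
    records = [{"url": line.strip()} for line in lines if line.strip()]
    Path(json_path).write_text(json.dumps(records, indent=2), encoding="utf-8")
    return len(records)
```

After calling txt_urls_to_json("subset.txt", "subset.json"), the download would then be run along the lines of img2dataset --url_list=subset.json --input_format=json --url_col=url --output_folder=out.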

rom1504, Jul 28 '22 07:07

It worked with the .json file provided. Thanks!

dogoulis, Jul 28 '22 07:07

Closing in favor of https://github.com/rom1504/img2dataset/issues/196

rom1504, Aug 21 '22 21:08