
customized dataset

Open: zhuang-li opened this issue 2 years ago • 1 comment

Hi, sorry for the naive question; I am new to computer vision and DALLE. Say I have a folder of images and captions such as

dir/
    cat.jpg
    cat.txt
    dog.jpg
    dog.txt

How do I convert this folder into the data format that DALLE-mini accepts if I want to train a new model? Is there a script for this in the repo?

I was using https://github.com/lucidrains/DALLE-pytorch, whose model takes data in the format above, but it is not clear to me what data format is used here.

zhuang-li, Sep 03 '22 07:09

This is the required format: https://github.com/borisdayma/dalle-mini/blob/main/tools/dataset/encode_dataset.ipynb
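As a rough starting point for the folder layout in the question, here is a minimal sketch that pairs each image with its caption file and packs the pairs into a single tar shard, where the files of one sample share a base name. It assumes the linked notebook reads WebDataset-style tar shards, which should be confirmed in the notebook itself. The names `find_pairs`, `write_shard`, `IMAGE_SUFFIXES`, and the shard filename are hypothetical and not part of the repo.

```python
import tarfile
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_pairs(folder):
    """Return sorted (image_path, caption_path) pairs that share a base name."""
    pairs = []
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if caption_path.is_file():  # skip images that have no caption file
            pairs.append((image_path, caption_path))
    return pairs


def write_shard(pairs, shard_path):
    """Write all pairs into one tar file and return the number of samples."""
    with tarfile.open(shard_path, "w") as tar:
        for image_path, caption_path in pairs:
            key = image_path.stem  # e.g. "cat" for cat.jpg and cat.txt
            tar.add(image_path, arcname=f"{key}{image_path.suffix.lower()}")
            tar.add(caption_path, arcname=f"{key}.txt")
    return len(pairs)


if __name__ == "__main__":
    # Hypothetical paths: "dir" is the folder from the question.
    count = write_shard(find_pairs("dir"), "shard-00000.tar")
    print(f"wrote {count} samples")
```

Images without a matching `.txt` file are skipped rather than written with an empty caption, so a missing caption shows up as a lower sample count instead of silently training on blank text.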

borisdayma, Sep 04 '22 15:09