stylegan2-flax-tpu

Details on subset extraction

Open • rom1504 opened this issue 2 years ago • 4 comments

Hey, I read "The image datasets were built using the LAION5B index". That's great, glad the index was useful!

I'm curious how exactly you used it. Can you provide some more details?

rom1504, Jul 21 '22 13:07

Would be interested as well. There is talk of labels being used in training, based on folder name, but I don't see them used anywhere for generation / inference?

Ontopic, Jul 22 '22 09:07

Hello @rom1504

First of all, thanks for your work on the LAION index, it helped us considerably!

Our workflow was simple:

  • Find an image representative of the subject we want to train on. E.g., for cheesecake-256 we used this one: https://i.pinimg.com/736x/34/db/ee/34dbeee7079c54aeb10543910285c82d--ny-cheesecake-recipe-new-york-style-cheesecake.jpg
  • Search using the image’s CLIP embedding on knn5.laion.ai, getting ~250k results
  • Download them with img2dataset
  • Deduplicate, crop, filter, etc. until we were happy with the resulting dataset (50-100k images)
  • Train the model!
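As a rough illustration of the exact-deduplication part of the workflow above, here is a minimal sketch that drops byte-identical files by comparing MD5 hashes. It is not the code used for these datasets: the helper names `md5_of_file` and `dedupe_exact` are hypothetical, and it assumes the images have already been downloaded to local files and hashes them afterwards.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedupe_exact(paths):
    """Keep the first file seen for each distinct MD5 digest (hypothetical helper)."""
    seen = set()
    unique = []
    for path in paths:
        file_hash = md5_of_file(Path(path))
        if file_hash not in seen:
            seen.add(file_hash)
            unique.append(path)
    return unique
```

This only removes files whose bytes match exactly. A resized or re-encoded copy of the same picture hashes differently and is kept.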

Two notes:

  • We currently do MD5 deduplication. It’s very handy that img2dataset can compute the hashes at download time. However, that is of course not sufficient: many images are near-duplicates rather than exact duplicates. One solution would be to request deduplicated results from the knn5.laion.ai backend, but this results in a timeout for >60k images (and we also feel bad running queries that take full minutes of your server’s time).
  • We’d want to do some CLIP filtering on the resulting dataset (including near-duplicate filtering, but also removing some classes of images, e.g. chocolate cakes from cheesecakes). However, there doesn’t seem to be a way to retrieve an image’s CLIP embedding from knn5.laion.ai given its LAION ID. That would have been useful to us.
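To make the embedding-based filtering described in the second note concrete, here is a minimal sketch using cosine similarity. It assumes the CLIP embeddings are already available locally as plain lists of floats (for example, computed with a CLIP model of your choice). The function names, the `unwanted` reference embedding, and both thresholds are illustrative assumptions, not values from this thread.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_embeddings(embeddings, unwanted, dup_threshold=0.95, unwanted_threshold=0.9):
    """Return the indices of images to keep (hypothetical helper).

    An image is dropped if it is too similar to the `unwanted` reference
    embedding (e.g. a chocolate cake), or too similar to an image that
    has already been kept (a near-duplicate).
    """
    kept = []
    for i, emb in enumerate(embeddings):
        if cosine(emb, unwanted) >= unwanted_threshold:
            continue  # looks like the unwanted class
        if any(cosine(emb, embeddings[j]) >= dup_threshold for j in kept):
            continue  # near-duplicate of an image already kept
        kept.append(i)
    return kept
```

The pairwise comparison is quadratic in the number of images, so for datasets of the size discussed here an approximate nearest-neighbour index would likely be needed in place of the inner loop.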

For these two reasons, and also because we generally intend to scale up the datasets we work with, we are planning to run our own LAION backend soon.

MasterScrat, Jul 22 '22 13:07

Sounds good! Providing embeddings as output and faster deduplication are features that could definitely be implemented in clip-retrieval.

I'm glad this was useful to you.

rom1504, Jul 22 '22 16:07

Hey! The generation networks that you provided produce exceptionally good quality images. Since I'm working on a research project on synthetic image detection, can you also provide us with the representative queries for sushi, cocktail and cookie images? Thanks in advance!

dogoulis, Sep 05 '22 16:09

@dogoulis the representative images were:

  • sushi: https://fitpeople.com/tr/wp-content/uploads/2020/03/sushiler-e1584447473717.jpg
  • cookie: https://www.texanerin.com/content/uploads/2021/09/vegan-peanut-butter-chocolate-chip-cookies-1200-1-200x200.jpg
  • cocktail: https://cdn.wallpapersafari.com/72/95/oykiHN.jpg
  • cheesecake: https://i.pinimg.com/736x/34/db/ee/34dbeee7079c54aeb10543910285c82d--ny-cheesecake-recipe-new-york-style-cheesecake.jpg

MasterScrat, Oct 27 '22 21:10