Romain Beaumont
@lucidrains I merged your 3 changes into one branch, can you please check that it looks good?
To resume training with this: set strict=False at https://github.com/mlfoundations/open_clip/blob/main/src/training/main.py#L199 and comment out the optimizer load (the approach in https://github.com/pytorch/pytorch/issues/34660#issue-580123812 would also work, but resetting the optimizer state is probably a good idea anyway).
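A minimal sketch of that resume path, assuming the checkpoint is a dict with `state_dict` and `optimizer` entries. The helper name `resume_weights_only` and the toy models are made up for illustration; this is not the code in main.py.

```python
import torch
from torch import nn


def resume_weights_only(model, checkpoint):
    """Load model weights leniently and ignore any saved optimizer state."""
    # strict=False tolerates missing/unexpected keys; shape mismatches still raise.
    result = model.load_state_dict(checkpoint["state_dict"], strict=False)
    # checkpoint["optimizer"] is deliberately not loaded, so the optimizer
    # created for this run starts from a fresh state.
    return result.missing_keys, result.unexpected_keys


# Toy example: the new model has one extra layer the checkpoint knows nothing about.
old_model = nn.Sequential(nn.Linear(4, 4))
old_optimizer = torch.optim.AdamW(old_model.parameters(), lr=1e-3)
checkpoint = {
    "state_dict": old_model.state_dict(),
    "optimizer": old_optimizer.state_dict(),
}

new_model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
missing, unexpected = resume_weights_only(new_model, checkpoint)
new_optimizer = torch.optim.AdamW(new_model.parameters(), lr=1e-3)
print("missing:", missing, "unexpected:", unexpected)
```

Checking the returned missing and unexpected keys is worth doing, since strict=False will otherwise silently skip anything that does not match.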
We didn't end up needing this. Closing for now.
We didn't end up needing this. Closing for now to keep things simple.
opened https://github.com/mlfoundations/open_clip/pull/173
https://www.cs.rice.edu/~vo9/sbucaptions/ SBU Captions is provided as URLs + captions in JSON. You can use that as input to img2dataset.
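A rough sketch of turning that JSON into a url/caption TSV, which is one of the input formats img2dataset reads. The key names `image_urls` and `captions` are an assumption about the JSON layout, so check them against the file you downloaded; the function name `sbu_json_to_tsv` is made up for this example.

```python
import csv
import json


def sbu_json_to_tsv(json_path, tsv_path, url_key="image_urls", caption_key="captions"):
    """Write parallel URL and caption lists from a JSON file as a two-column TSV."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    urls = data[url_key]          # assumed key name, adjust if the file differs
    captions = data[caption_key]  # assumed key name, adjust if the file differs
    if len(urls) != len(captions):
        raise ValueError("URL and caption lists have different lengths")
    with open(tsv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["url", "caption"])
        for url, caption in zip(urls, captions):
            # Collapse tabs/newlines inside captions so each sample stays on one row.
            writer.writerow([url, " ".join(caption.split())])
    return len(urls)
```

The resulting file can then be passed to img2dataset with something like `--url_list sbu.tsv --input_format tsv --url_col url --caption_col caption --output_format webdataset`.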
https://visualgenome.org/api/v0/api_home.html Visual Genome is not distributed as image URLs, so you can simply download the images and make a tar with them. That's all a webdataset is.
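For example, a small sketch that packs a folder of already downloaded images into a single tar using only the standard library. The function name `pack_images` and the extension list are just for illustration.

```python
import tarfile
from pathlib import Path


def pack_images(image_dir, tar_path, extensions=(".jpg", ".jpeg", ".png")):
    """Pack the image files of a directory into an uncompressed tar, sorted by name."""
    files = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    with tarfile.open(tar_path, "w") as tar:
        for p in files:
            # Store only the file name so that entries sharing a basename
            # (e.g. 0001.jpg and 0001.txt) can be grouped as one sample.
            tar.add(str(p), arcname=p.name)
    return [p.name for p in files]
```

If you also have captions or metadata, write them next to each image with the same basename (0001.jpg, 0001.txt) and widen the extension list before packing.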
Could you check whether you have any config somewhere (e.g. on the YARN cluster) that stops the cluster when a job stops?
https://github.com/PrismarineJS/minecraft-data/issues/643
let's jump straight to 1.19.2 imo