Mitchell Wortsman
It would be nice to add a `jax_src` folder that supports training CLIP models in [Jax](https://github.com/google/jax). This would also help with https://github.com/mlfoundations/open_clip/issues/20.
Even with buffer size and initial both set to the size of the dataset, I am not seeing a completely shuffled dataset. I have created a toy dataset with...
Added a new flag `--accum-freq` (accumulation frequency) which defaults to 1. If this is greater than 1, then the optimizer is only stepped every `--accum-freq` batches. Can be combined with...
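The stepping rule described above can be illustrated with a toy sketch. This is not the PR's implementation: it uses plain Python and a one-parameter least-squares model so it runs without PyTorch, and `grad_mse` and `train` are hypothetical names chosen for the illustration. It assumes equally sized batches and drops any leftover gradient when the number of batches is not a multiple of `accum_freq`.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for y ~ w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train(batches, accum_freq=1, lr=0.01, w=0.0):
    """Update `w` only once every `accum_freq` batches (hypothetical helper)."""
    accumulated = 0.0
    num_steps = 0
    for i, batch in enumerate(batches):
        # Divide by accum_freq so the accumulated value is the average
        # gradient over the micro-batches, not their sum.
        accumulated += grad_mse(w, batch) / accum_freq
        if (i + 1) % accum_freq == 0:
            w -= lr * accumulated  # the "optimizer step"
            accumulated = 0.0
            num_steps += 1
    return w, num_steps


# Toy data: y = 3x, split into four micro-batches of two samples each.
micro_batches = [[(x, 3 * x), (x + 1, 3 * (x + 1))] for x in (1, 3, 5, 7)]
# The same data as two full batches of four samples each.
full_batches = [micro_batches[0] + micro_batches[1],
                micro_batches[2] + micro_batches[3]]

w_accum, steps_accum = train(micro_batches, accum_freq=2)
w_full, steps_full = train(full_batches, accum_freq=1)
```

With equally sized micro-batches, accumulating over two of them gives the same update as one batch of twice the size, so `w_accum` and `w_full` agree up to floating-point rounding, and both runs take two steps.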
Make sure that `bn_type` is set to `NonAffineBN` for SplitCIFAR experiments! This was an error in our initial release of this repo.
Similar to https://github.com/mlfoundations/open_clip/issues/261, I am getting an OOM error with batch size 1 on a 40GB GPU with ViT-G.
If there is an experiment which already exists with the same `--name`, auto-resume from it instead of exiting.
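One way such an auto-resume check could look is sketched below. The directory layout (`<logs>/<name>/checkpoints/epoch_<N>.pt`) and the function name `find_resume_checkpoint` are assumptions made for illustration, not details confirmed by the PR.

```python
from pathlib import Path

PREFIX = "epoch_"  # assumed checkpoint naming scheme: epoch_<N>.pt


def find_resume_checkpoint(logs_dir, name):
    """Return the highest-numbered checkpoint for experiment `name`, or None.

    Hypothetical helper: returns None when the experiment directory does not
    exist or holds no numbered checkpoints, so the caller starts fresh.
    """
    ckpt_dir = Path(logs_dir) / name / "checkpoints"
    if not ckpt_dir.is_dir():
        return None
    # Keep only files whose suffix after "epoch_" is a plain integer.
    numbered = [p for p in ckpt_dir.glob(PREFIX + "*.pt")
                if p.stem[len(PREFIX):].isdigit()]
    # Compare numerically so that epoch_10 ranks above epoch_2.
    return max(numbered, key=lambda p: int(p.stem[len(PREFIX):]), default=None)
```

A training script could call this before creating the experiment directory and, if a path comes back, load it rather than exiting.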
I'm noticing that `logit_scale` sharply changes direction on resume. Probably not a huge deal as it stays fairly close to 100, but worth tracking this issue in case others...
This PR introduces two additional arguments, `--sync-s3` and `--sync-s3-frequency`. The recommended use is to pass `--sync-s3 s3://` and `--logs /scratch/logs`, where the logs path is hopefully on a local SSD. Then, as you run,...