Mitchell Wortsman
ah, yeah, I should make this work with the new auto-resume feature, which is where the conflict is coming from (https://github.com/mlfoundations/open_clip/pull/303). Then yes, I think it's good to merge after that.
Merge conflict fixed, but I still need to add support for the `resume = 'latest'` feature.
Ok, should be good to go.
This is a great idea, is anyone interested in making one of these? Are there any specific questions you have? Datasets can be in csv or webdataset format (see https://github.com/rom1504/img2dataset)...
the `p.ndim < 2` check should also cover `logit_scale`
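To illustrate why the `p.ndim < 2` check also covers `logit_scale`: the CLIP logit scale is a scalar parameter (`ndim == 0`), so an ndim-based split naturally places it in the no-weight-decay group alongside biases and norm gains. A minimal sketch (the model and hyperparameters here are illustrative, not the actual open_clip code):

```python
import torch

# Hypothetical tiny model standing in for the CLIP towers.
model = torch.nn.Linear(4, 4)
# logit_scale is a scalar parameter, so p.ndim == 0 < 2.
logit_scale = torch.nn.Parameter(torch.ones([]) * 2.659)

params = list(model.parameters()) + [logit_scale]
decay = [p for p in params if p.ndim >= 2]    # weight matrices
no_decay = [p for p in params if p.ndim < 2]  # biases, gains, logit_scale

optimizer = torch.optim.AdamW(
    [{"params": decay, "weight_decay": 0.2},
     {"params": no_decay, "weight_decay": 0.0}]
)
```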
You may be interested in https://github.com/mlfoundations/open_clip/pull/267
@usuyama yep! if you check out the pseudocode above, it doesn't really depend on how loss is implemented
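For reference, a hedged sketch of the feature-caching accumulation idea (names and structure are illustrative, not the actual open_clip implementation): first forward every micro-batch without gradients to cache features, then re-forward each micro-batch with gradients, splice it into the cache, and backprop. Any loss function that consumes the full feature batch works, which is why it doesn't depend on how the loss is implemented.

```python
import torch

def accum_step(model, batches, optimizer, loss_fn):
    # 1) Forward all micro-batches without grad, caching features.
    with torch.no_grad():
        cached = [model(x).detach() for x in batches]
    optimizer.zero_grad()
    # 2) Re-forward each micro-batch with grad; splice its fresh
    #    features into the cached ones so the loss sees the full batch.
    for i, x in enumerate(batches):
        feats = model(x)
        all_feats = torch.cat(cached[:i] + [feats] + cached[i + 1:])
        loss = loss_fn(all_feats)
        loss.backward()  # grads accumulate across micro-batches
    optimizer.step()
```

Note the gradient only flows through the freshly recomputed micro-batch on each inner iteration; summing over all iterations recovers the full-batch gradient contribution per micro-batch.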
Sounds good. Using `--accum-freq k` is just over `k` times slower than `--accum-freq 1`.
Here is a screenshot verifying that training on 8 gpus with per-gpu batch size 512 behaves the same as training on 4 gpus with per-gpu batch size 512 and accum...
> Cool! Is this an implementation of GradAccum in [BASIC](https://arxiv.org/pdf/2111.10050.pdf)?

Not exactly, but it looks like an overall similar approach.