Mitchell Wortsman
That is a good idea; no, we have not thought about this! It is difficult, as supermasks that do the same thing could look very different.
Sorry, I mean `bigG`, not `g`.
Seems like progress is being made with FSDP, and we also think the OOM was because of model size + activations.
Hey Aditya, thanks for the PR with MRL -- however, if you want to make MRL an option, it would be good to have a flag so that this PR...
Sure, can you convert to draft in the meantime?
Yeah, totally agree. While I'll likely keep using this for my existing run, I like your implementation better for the repo going forward, so I'll close this. Thanks!
Closing, because the hypothesis is that it relates to a filesystem issue, which should not affect most users.
Here's one hypothesis for what's going on: look at the graphs for `logit_scale` and `samples/s` towards the end of training -- the dips in `logit_scale` occur towards the end of...
Hi Adam, this looks great! I don't have access to this repo anymore because I'm no longer on the internship, but let's keep this issue open so that other people can...
And very nice paper -- thanks for sharing!