Mitchell Wortsman comments

Results: 88 comments of Mitchell Wortsman

Add support for gradient accumulation.

Any thoughts on if this can be merged?

GradCAM visualizations

I know @sagadre has for the OAI ViT models at least, but I don't think he has for the open_clip ones. He'll know more.

(Feature Request) Model EMA

Are you fine-tuning for classification tasks or continuing to train on image-text pairs? In any case, one other thing to try is linearly interpolating the weights before and after fine-tuning...
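
For anyone wanting to try the interpolation idea, here is a minimal sketch, not the repo's implementation. It assumes both checkpoints share the same parameter names. `interpolate_weights`, `alpha`, and the toy checkpoint keys are hypothetical names chosen for illustration. Scalars stand in for tensors so the snippet runs without any dependencies; the same arithmetic should apply to floating-point torch tensors or NumPy arrays.

```python
from typing import Any, Dict


def interpolate_weights(before: Dict[str, Any], after: Dict[str, Any], alpha: float) -> Dict[str, Any]:
    """Linearly interpolate two checkpoints that have identical keys.

    alpha=0 returns the weights before fine-tuning, alpha=1 the weights after.
    Values only need to support * and + (floats here; arrays or tensors in practice).
    """
    if before.keys() != after.keys():
        raise ValueError("checkpoints must have the same parameter names")
    return {name: (1 - alpha) * before[name] + alpha * after[name] for name in before}


# Toy checkpoints with scalar "weights" (hypothetical); real ones would be state dicts.
zero_shot = {"proj.weight": 1.0, "proj.bias": 0.0}
fine_tuned = {"proj.weight": 3.0, "proj.bias": 2.0}

mixed = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
print(mixed)  # {'proj.weight': 2.0, 'proj.bias': 1.0}
```

In practice one would probably sweep `alpha` over a few values between 0 and 1 and pick the one that evaluates best on held-out data.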

(Feature Request) Model EMA

Great to hear! In case you're interested, there is some more background on that trick here: https://arxiv.org/abs/2109.01903

About fixing the bias in the layers

Hello, Thanks for the question! Do you mean in the convs? We are always setting `bias=False` in the convs (e.g., https://github.com/RAIVNLab/supsup/blob/master/models/builder.py#L29-L66).

About fixing the bias in the layers

Yes, it could! I don't know if that would make much of a difference.

This seems like more of a question for

- https://github.com/uber-research/deconstructing-lottery-tickets
- https://github.com/allenai/hidden-networks

though I don't believe anyone has tried this! Very cool idea.

Densenets supermask

Oops! Sorry about that :) We tried skip-connections with resnets [here](https://arxiv.org/abs/1911.13299) which worked well. I believe dense-connections have not been explored with supermasks and it seems like a really interesting...

Densenets supermask

Thanks, that could definitely help!

Densenets supermask

Thank you, we have seen this but haven't taken a close look! Hopefully we can soon, it seems awesome.