Mitchell Wortsman
Any thoughts on if this can be merged?
I know @sagadre has for the OAI ViT models at least, but I don't think he has for the open_clip ones. He'll know more.
Are you fine-tuning for classification tasks or continuing to train on image-text pairs? In any case one other thing to try is linearly interpolating the weights before and after fine-tuning...
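A minimal sketch of that interpolation, assuming both checkpoints share the same parameter names and that each value supports scalar multiplication and addition (plain floats here; floating-point tensors from a `state_dict` would work the same way). The function name `interpolate_weights` and the mixing coefficient `alpha` are hypothetical names chosen for illustration, not from any particular codebase.

```python
def interpolate_weights(before, after, alpha):
    """Return (1 - alpha) * before + alpha * after, parameter by parameter.

    alpha = 0.0 recovers the weights before fine-tuning,
    alpha = 1.0 recovers the fine-tuned weights.
    """
    if before.keys() != after.keys():
        raise ValueError("checkpoints must have identical parameter names")
    return {
        name: (1.0 - alpha) * before[name] + alpha * after[name]
        for name in before
    }


# Toy checkpoints with scalar "weights" (illustrative values only).
before = {"layer.weight": 0.0, "layer.bias": 2.0}
after = {"layer.weight": 1.0, "layer.bias": 4.0}

mixed = interpolate_weights(before, after, alpha=0.5)
print(mixed)  # {'layer.weight': 0.5, 'layer.bias': 3.0}
```

In practice you would sweep `alpha` over a few values between 0 and 1 and evaluate each interpolated model on held-out data.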
Great to hear! In case you're interested, there is some more background on that trick here: https://arxiv.org/abs/2109.01903
Hello, Thanks for the question! Do you mean in the convs? We are always setting `bias=False` in the convs (e.g., https://github.com/RAIVNLab/supsup/blob/master/models/builder.py#L29-L66).
Yes, it could! I don't know if that would make much of a difference.
This seems like more of a question for
- https://github.com/uber-research/deconstructing-lottery-tickets
- https://github.com/allenai/hidden-networks

though I don't believe anyone has tried this! Very cool idea.
Oops! Sorry about that :) We tried skip-connections with resnets [here](https://arxiv.org/abs/1911.13299) which worked well. I believe dense-connections have not been explored with supermasks and it seems like a really interesting...
Thanks, that could definitely help!
Thank you, we have seen this but haven't taken a close look! Hopefully we can soon; it seems awesome.