Mitchell Wortsman

88 comments by Mitchell Wortsman

The example finetune.py may be helpful -- it's for ImageNet, but I believe the structure is similar

@Djoels hmm.. are you getting that plot by following the steps in the repository, or by doing something custom? When we follow the steps we get the figure you see here: https://github.com/mlfoundations/model-soups

@ivovdongen would love to help if possible! Can you give me some more detail about your task, your network, and how you are fine-tuning? one...

Thanks! Ok, as suspected I think the issue in terms of model soups performance is that you're introducing new parameters. Looking at your model definition ``` vit_b32 = Sequential([ layers.Input(shape=(224,224,3),...
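One quick way to check whether fine-tuning introduced new parameters is to compare parameter names between the pretrained and fine-tuned checkpoints. The sketch below is only an illustration: the helper `new_parameter_names` and the toy checkpoints (plain dicts mapping a parameter name to a flat list of weights) are hypothetical stand-ins, not taken from the repository or from the model quoted above.

```python
def new_parameter_names(pretrained, finetuned):
    """Return names that exist in the fine-tuned checkpoint but not in the pretrained one."""
    return sorted(set(finetuned) - set(pretrained))


# Hypothetical toy checkpoints: parameter name -> flat list of weights.
pretrained = {
    "encoder.weight": [0.10, 0.20],
    "encoder.bias": [0.00],
}
finetuned = {
    "encoder.weight": [0.11, 0.19],
    "encoder.bias": [0.01],
    "head.weight": [0.50, -0.50],  # added during fine-tuning, absent from the pretrained model
}

print(new_parameter_names(pretrained, finetuned))
```

Any name this reports was added on top of the pretrained network.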

I'm not familiar with TF/Keras but at a high level looks good! Just trying to make your experimental setting more similar to what we consider. Concretely, in the paper we...

great! should be fine to include, this is typically what we do.

We have not tried regression models.. but I don't really see why that wouldn't work. I'm confused by this: > I get approximately 91% accuracy on a held-out test using...

Hmmm. Are you introducing new params when fine-tuning? What LR?

Can you try just souping the small LR models, e.g., 1e-05+AdamW and 2e-05+RMSProp? I think the LR may just be too high for the other models
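For reference, souping a subset of models can be as simple as a uniform average of their weights. Here is a minimal sketch, assuming each checkpoint is a plain dict mapping a parameter name to a flat list of floats; the function `uniform_soup` and the two toy checkpoints are hypothetical and only stand in for the real fine-tuned models.

```python
def uniform_soup(checkpoints):
    """Average several checkpoints parameter by parameter (a uniform soup)."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")

    names = set(checkpoints[0])
    for ckpt in checkpoints[1:]:
        # Every model must expose exactly the same parameters to be averaged.
        if set(ckpt) != names:
            raise ValueError("checkpoints do not share the same parameter names")

    soup = {}
    for name in checkpoints[0]:
        tensors = [ckpt[name] for ckpt in checkpoints]
        if len({len(t) for t in tensors}) != 1:
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise mean across the models.
        soup[name] = [sum(vals) / len(tensors) for vals in zip(*tensors)]
    return soup


# Hypothetical toy checkpoints standing in for the two small-LR runs.
adamw_1e5 = {"w": [1.0, 2.0], "b": [0.0]}
rmsprop_2e5 = {"w": [3.0, 4.0], "b": [1.0]}

soup = uniform_soup([adamw_1e5, rmsprop_2e5])
print(soup)
```

With real models the same averaging would be applied to the framework's own weight containers rather than Python lists.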