
Add pretrained EfficientNetV2b1 weights


Scores 0.756 ImageNet top-1 accuracy, vs. the paper's claimed 0.798 (~95% of the claimed result).

ianstenbit avatar Oct 14 '22 16:10 ianstenbit

Is there any particular reason that the train accuracy (~55%) is way lower than val accuracy (~75%)?

tanzhenyu avatar Oct 14 '22 16:10 tanzhenyu

> Is there any particular reason that the train accuracy (~55%) is way lower than val accuracy (~75%)?

This is due to the use of RandAugment+CutMix, which makes the training dataset much harder than the validation dataset.

ianstenbit avatar Oct 14 '22 16:10 ianstenbit
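For readers landing here: below is a minimal sketch of what such a training-only augmentation pipeline looks like with the keras_cv preprocessing layers. The specific arguments (magnitude, augmentations per image, batch sizes) are assumptions for illustration, not the actual recipe used to train these weights.

```python
import tensorflow as tf
import keras_cv

# Stand-in for an ImageNet pipeline: batched dicts of images and one-hot labels.
train_ds = tf.data.Dataset.from_tensor_slices({
    "images": tf.random.uniform((8, 224, 224, 3), 0, 255),
    "labels": tf.one_hot(tf.random.uniform((8,), 0, 1000, dtype=tf.int32), 1000),
}).batch(4)

# Training-only augmentations. RandAugment distorts each image; CutMix
# pastes patches between images in a batch and mixes their labels, so a
# "correct" prediction on a training batch is genuinely harder to make.
rand_augment = keras_cv.layers.RandAugment(
    value_range=(0, 255), augmentations_per_image=3, magnitude=0.3
)
cut_mix = keras_cv.layers.CutMix()

def augment(batch):
    batch = rand_augment(batch)
    batch = cut_mix(batch)
    return batch

train_ds = train_ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
# The validation split gets no augmentation, which is why validation
# accuracy can sit well above training accuracy during training.
```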

> Is there any particular reason that the train accuracy (~55%) is way lower than val accuracy (~75%)?
>
> This is due to the use of RandAugment+CutMix, which makes the training dataset much harder than the validation dataset.

That seems to suggest the training loss will keep decreasing if we train for more epochs, and probably the val loss as well?

tanzhenyu avatar Oct 14 '22 17:10 tanzhenyu

With or without training more epochs to verify it, I think this PR is good to go.

tanzhenyu avatar Oct 14 '22 17:10 tanzhenyu

I've also seen training accuracy come out significantly lower than validation accuracy when using RandAugment and CutMix/MixUp (~0.55 training, ~0.95 validation).

I like to think that this keeps the network aware that there's more room for improvement, so it won't stop updating the weights to better fit the data, which is a common issue once training accuracy gets high. However, with more training, it might conceivably lower the validation accuracy by treating itself as more wrong than it really is? If the two start diverging too much, maybe lowering the magnitude of the random augmentation would help (see the sketch below)?

DavidLandup0 avatar Oct 15 '22 18:10 DavidLandup0
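The knob DavidLandup0 refers to is RandAugment's `magnitude` argument. The values below are illustrative, not a tuned recommendation:

```python
import keras_cv

# Illustrative only: a gentler RandAugment configuration. magnitude is a
# float in [0, 1]; lower values apply weaker transformations, narrowing
# the gap between train-time and val-time difficulty.
gentler = keras_cv.layers.RandAugment(
    value_range=(0, 255), augmentations_per_image=2, magnitude=0.15
)
```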

/gcbrun

ianstenbit avatar Oct 17 '22 01:10 ianstenbit

/gcbrun

ianstenbit avatar Oct 17 '22 04:10 ianstenbit

/gcbrun

ianstenbit avatar Oct 17 '22 14:10 ianstenbit