EfficientNetV2-pytorch
Unofficial EfficientNetV2 implementation in PyTorch (PyTorch Lightning), with pretrained weights.
It contains:
- Simple implementation of the model (here)
- Pretrained model (NumPy weights converted from the official TensorFlow checkpoints)
- Training code (here)
- Tutorials (Colab EfficientNetV2-predict tutorial, Colab EfficientNetV2-finetuning tutorial)
- Experiment results
Index
- Tutorial
- Experiment results
- Experiment Setup
- References
Tutorial
Colab Tutorial
- How to use the model on Colab? Please check the Colab EfficientNetV2-predict tutorial.
- How to train the model on Colab? Please check the Colab EfficientNetV2-finetuning tutorial.
- See how CutMix, Cutout, and MixUp work in the Colab Data augmentation tutorial (a minimal CutMix sketch follows below).
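For a quick look at what CutMix does before opening the notebook, here is a minimal sketch. This is not this repo's implementation; the `cutmix` helper and its arguments are illustrative:

```python
import torch

def cutmix(images, labels, alpha=1.0):
    """Paste a random patch from a shuffled copy of the batch and mix
    the labels in proportion to the pasted area."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    h, w = images.shape[2:]
    r = (1.0 - lam) ** 0.5            # patch side ratio, so area ~ (1 - lam)
    cut_h, cut_w = int(h * r), int(w * r)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    images[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)  # exact area after clipping
    return images, labels, labels[perm], lam

# The loss is mixed with the same ratio:
#   loss = lam * ce(logits, y_a) + (1 - lam) * ce(logits, y_b)
```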
How to load pretrained model?
If you just want to use the pretrained model, load it with `torch.hub.load`:

```python
import torch

model = torch.hub.load('hankyul2/EfficientNetV2-pytorch', 'efficientnet_v2_s', pretrained=True, nclass=1000)
print(model)
```
Available model names: `efficientnet_v2_{s|m|l}` (ImageNet), `efficientnet_v2_{s|m|l}_in21k` (ImageNet21k)
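A short prediction sketch building on the snippet above. The preprocessing below is an assumption based on common ImageNet practice, not taken from this repo:

```python
import torch
from torchvision import transforms
from PIL import Image

model = torch.hub.load('hankyul2/EfficientNetV2-pytorch',
                       'efficientnet_v2_s', pretrained=True, nclass=1000)
model.eval()

# Assumed preprocessing: standard ImageNet normalization at 224x224.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open('cat.jpg')).unsqueeze(0)  # (1, 3, 224, 224)
with torch.no_grad():
    probs = model(img).softmax(dim=-1)
print(probs.argmax(dim=-1))  # predicted ImageNet class index
```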
How to fine-tune the model?
If you want to fine-tune on CIFAR, use this repository.
- Clone this repo and install the dependencies:

```bash
git clone https://github.com/hankyul2/EfficientNetV2-pytorch.git
pip3 install -r requirements.txt
```
- Train & test the model (see more examples in tmuxp/cifar.yaml):

```bash
python3 main.py fit --config config/efficientnetv2_s/cifar10.yaml --trainer.gpus 2,3,
```
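If you prefer plain PyTorch over the Lightning CLI, a minimal fine-tuning sketch might look like this. It assumes `nclass` resizes the classification head (as the hub signature above suggests); `train_loader` and the label-smoothing value are placeholders:

```python
import torch
from torch.optim import AdamW

# Assumption: nclass=10 builds a 10-way head on the pretrained in21k backbone.
model = torch.hub.load('hankyul2/EfficientNetV2-pytorch',
                       'efficientnet_v2_s_in21k', pretrained=True, nclass=10)

optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.005)
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)  # smoothing value assumed

model.train()
for images, labels in train_loader:  # train_loader: your CIFAR-10 DataLoader
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```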
Experiment Results
| Model Name | Pretrained Dataset | CIFAR10 acc. (%) | CIFAR100 acc. (%) |
|---|---|---|---|
| EfficientNetV2-S | ImageNet | 98.46 (tf.dev, weight) | 90.05 (tf.dev, weight) |
| EfficientNetV2-M | ImageNet | 98.89 (tf.dev, weight) | 91.54 (tf.dev, weight) |
| EfficientNetV2-L | ImageNet | 98.80 (tf.dev, weight) | 91.88 (tf.dev, weight) |
| EfficientNetV2-S-in21k | ImageNet21k | 98.50 (tf.dev, weight) | 90.96 (tf.dev, weight) |
| EfficientNetV2-M-in21k | ImageNet21k | 98.70 (tf.dev, weight) | 92.06 (tf.dev, weight) |
| EfficientNetV2-L-in21k | ImageNet21k | 98.78 (tf.dev, weight) | 92.08 (tf.dev, weight) |
| EfficientNetV2-XL-in21k | ImageNet21k | - | - |
Note
- The results are a combination of:
  - Half precision
  - Super convergence (epoch=20)
  - AdamW (weight_decay=0.005)
  - EMA (decay=0.999)
  - CutMix (prob=1.0)
- Changes from the original paper (CIFAR):
  - We ran only 20 epochs to get the results above; running more epochs should give higher accuracy.
  - What we changed from the original setup: optimizer (SGD to AdamW), LR scheduler (cosine LR to OneCycle LR), augmentation (Cutout to CutMix), image size (384 to 224), epochs (105 to 20).
  - Hyper-parameters, from most to least important: LR -> weight_decay -> ema_decay -> cutmix_prob -> epoch.
- You can reproduce the same results by running tmuxp/cifar.yaml. A sketch of how these pieces fit together in one training step follows below.
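For illustration, here is a minimal sketch of how half precision, super convergence (OneCycle LR), AdamW, EMA, and CutMix combine in a training loop. Hyper-parameters follow the setup table below; the data pipeline is simplified and the EMA update is a parameter-only version, so treat this as a sketch rather than the repo's actual training code:

```python
import copy
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Simplified data pipeline (the real setup also uses random_crop with pad=4).
train_loader = DataLoader(
    datasets.CIFAR10('data', train=True, download=True,
                     transform=transforms.Compose([
                         transforms.Resize(224),
                         transforms.RandomHorizontalFlip(),
                         transforms.ToTensor()])),
    batch_size=256, shuffle=True)

model = torch.hub.load('hankyul2/EfficientNetV2-pytorch',
                       'efficientnet_v2_s', pretrained=True, nclass=10).cuda()
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)  # smoothing value assumed
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.005)
scheduler = OneCycleLR(optimizer, max_lr=1e-3, epochs=20,
                       steps_per_epoch=len(train_loader))   # super convergence
scaler = torch.cuda.amp.GradScaler()                        # 16-bit precision
ema_model, ema_decay = copy.deepcopy(model), 0.999

for epoch in range(20):
    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()
        images, y_a, y_b, lam = cutmix(images, labels)  # sketched above (prob=1.0)
        with torch.cuda.amp.autocast():
            logits = model(images)
            loss = lam * criterion(logits, y_a) + (1 - lam) * criterion(logits, y_b)
        optimizer.zero_grad()
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        scheduler.step()  # OneCycle is stepped per batch
        with torch.no_grad():  # EMA shadow weights, used at evaluation time
            for p_ema, p in zip(ema_model.parameters(), model.parameters()):
                p_ema.mul_(ema_decay).add_(p, alpha=1 - ema_decay)
```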
Experiment Setup
- Cifar setup

| Category | Contents |
|---|---|
| Dataset | CIFAR10 \| CIFAR100 |
| Batch size per GPU | (s, m, l) = (256, 128, 64) |
| Train augmentation | image_size = 224, horizontal flip, random_crop (pad=4), CutMix (prob=1.0) |
| Test augmentation | image_size = 224, center_crop |
| Model | EfficientNetV2 s \| m \| l (pretrained on in1k or in21k) |
| Regularization | Dropout=0.0, Stochastic_path=0.2, BatchNorm |
| Optimizer | AdamW (weight_decay=0.005) |
| Criterion | Label smoothing (CrossEntropyLoss) |
| LR scheduler | LR: (s, m, l) = (0.001, 0.0005, 0.0003), scheduler: OneCycle LR (epoch=20) |
| GPUs & etc. | 16-bit precision; EMA (decay=0.999, 0.9993, 0.9995); S: 2 × RTX 3090 (total batch size 512), M: 2 × RTX 3090 (256), L: 2 × RTX 3090 (128) |
References
EfficientNetV2
- Title: EfficientNetV2: Smaller Models and Faster Training
- Author: Mingxing Tan, Quoc V. Le
- Publication: ICML, 2021
- Link: Paper | official tensorflow repo | other pytorch repo
- Other references: