efficientvit Hyperparameter request for reproducibility

I'm training the EfficientViT B1 segmentation model on Cityscapes and getting ~0.6 mIoU, whereas the reported results are around 0.8 mIoU.

Would you be able to offer some guidance or share more details around the hyperparameters?

My setup is:

1024 x 2048 resolution
Backbone starts from the ImageNet checkpoints you have provided
Num Epochs: 100
LR: 0.005 with cosine annealing to 0
Batch Size: 2 (I'm limited by hardware at this resolution)
AdamW optimizer
Focal Loss w/ equal class weights
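
For reference, the learning-rate schedule above can be written out explicitly. This is only a sketch of the standard cosine annealing formula, assuming the rate is updated once per epoch; the helper name `cosine_lr` is illustrative and not taken from the EfficientViT code.

```python
import math


def cosine_lr(epoch, total_epochs=100, base_lr=0.005, min_lr=0.0):
    """Learning rate at `epoch` under cosine annealing from base_lr down to min_lr."""
    progress = epoch / total_epochs  # 0.0 at the start, 1.0 at the end of training
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the rate at the start, midpoint, and end of a 100-epoch run.
for epoch in (0, 50, 100):
    print(epoch, cosine_lr(epoch))
```

With these settings the rate starts at 0.005, is roughly half of that at the midpoint, and decays to 0 at the final epoch.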

Were any augmentations used? Is there anything else that could help?

Aug 16 '24 16:08 ovunctuzel-bc

Hi @ovunctuzel-bc, these seem like good hyperparameters for training. First, one question: there is no official release of training code for the segmentation EfficientViT, only for the SAM variant, right? Where did you get your reference code? If you could share some guidance, it would be very useful for me as well.

Aug 17 '24 18:08 Sanath1998

A fairly standard PyTorch training loop seems to work fine. The results are satisfactory but not quite at the level of the pretrained model.
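
A minimal sketch of what such a loop might look like with the settings from the original post (AdamW, LR 0.005, cosine annealing to 0, focal loss with equal class weights). This is not the official EfficientViT training code. The `focal_loss` helper, the `gamma=2.0` value, and stepping the scheduler once per iteration are assumptions, and `model` and `loader` stand in for whatever segmentation model and Cityscapes data loader are actually used.

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, target, gamma=2.0):
    """Multi-class focal loss with equal class weights (gamma=2.0 is an assumed value)."""
    ce = F.cross_entropy(logits, target, reduction="none")  # per-pixel cross-entropy
    pt = torch.exp(-ce)  # probability the model assigns to the true class
    return ((1.0 - pt) ** gamma * ce).mean()


def train(model, loader, epochs=100, base_lr=0.005, device="cpu"):
    """Train with AdamW and cosine annealing to 0; returns the final learning rate."""
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr)
    # Assumption: the scheduler is stepped once per iteration, not once per epoch.
    total_steps = epochs * len(loader)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=total_steps, eta_min=0.0
    )
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = focal_loss(model(images), labels)  # logits: (N, C, H, W)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()
    return optimizer.param_groups[0]["lr"]
```

Here `loader` is expected to yield `(images, labels)` batches, with `labels` holding integer class indices of shape `(N, H, W)`.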

Aug 22 '24 00:08 ovunctuzel-bc