MaxVit model

Open TeodorPoncu opened this issue 2 years ago • 5 comments

This PR implements the Batteries Phase 3 proposal to add the MaxVit architecture. It is still a work in progress, as the model has yet to be trained.

One caveat about how we would expose this model API to users: the architecture is bound to the specific input size it was trained on, due to its use of relative positional encodings.
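To illustrate why a learned relative positional bias can tie weights to a fixed spatial size, here is a minimal pure-Python sketch. It assumes a Swin-style bias table indexed by the relative offset between two positions in an attention window. The function names are hypothetical and this is not the code from this PR.

```python
def bias_table_size(height, width):
    """Number of distinct relative offsets inside a height x width window."""
    return (2 * height - 1) * (2 * width - 1)


def relative_position_index(height, width):
    """For every (query, key) pair of window positions, return the row of the
    bias table that stores the learned bias for their relative offset."""
    coords = [(r, c) for r in range(height) for c in range(width)]
    index = []
    for qr, qc in coords:
        row = []
        for kr, kc in coords:
            # Shift offsets so they start at 0 instead of -(size - 1).
            dr = qr - kr + (height - 1)
            dc = qc - kc + (width - 1)
            # Flatten the 2D offset into a single table row.
            row.append(dr * (2 * width - 1) + dc)
        index.append(row)
    return index


# A 7x7 window needs a table with 169 rows, an 8x8 window needs 225,
# so a table learned for one window size does not fit the other as-is.
small = bias_table_size(7, 7)   # 169
large = bias_table_size(8, 8)   # 225
```

Under this assumption, changing the window size (and therefore the input size it is derived from) changes the shape of the learned table, which is why pretrained weights cannot be reused at another resolution without interpolating or retraining the table.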

Running the command: torchrun --nproc_per_node=1 train.py --test-only --prototype --weights MaxVit_T_Weights.IMAGENET1K_V1 --model maxvit_t -b 1 yields the following results:

Test: Acc@1 83.700 Acc@5 96.722
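For readers unfamiliar with the metric names, Acc@k counts a prediction as correct when the true label is among the k highest-scoring classes. The following is a minimal pure-Python sketch of that definition, not the evaluation code in train.py. `topk_accuracy` and the toy scores are hypothetical.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy example: 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

acc1 = topk_accuracy(scores, labels, k=1)
acc2 = topk_accuracy(scores, labels, k=2)
```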

TeodorPoncu avatar Aug 01 '22 14:08 TeodorPoncu

@TeodorPoncu It seems that in a recent commit, you accidentally updated all the expected files for all models. Could you please revert that?

datumbox avatar Aug 05 '22 18:08 datumbox

@datumbox Sorry about that, everything should be fine now.

TeodorPoncu avatar Aug 05 '22 19:08 TeodorPoncu

Related discussion and pointers on generalizing fixed resolution for Swin: https://github.com/pytorch/vision/issues/6227

Also, I wonder whether more relative-attention-related modules can be reused from Swin.

vadimkantorov avatar Aug 19 '22 17:08 vadimkantorov

Running the deployed weights with the following command: torchrun --nproc_per_node=1 train.py --model maxvit_t --interpolation bicubic --batch-size 1 --test-only --weights MaxVit_T_Weights.IMAGENET1K_V1

Yields the following results: Test: Acc@1 83.700 Acc@5 96.722

TeodorPoncu avatar Sep 21 '22 15:09 TeodorPoncu

Two more requests:

  • Could you please upload the weights on manifold (see internal guide)?
  • Could you update the PR description to showcase the output accuracy of the following command?
torchrun --nproc_per_node=1 train.py --test-only --prototype --weights MaxVit_T_Weights.IMAGENET1K_V1 --model maxvit_t -b 1

datumbox avatar Sep 21 '22 16:09 datumbox

Hey @TeodorPoncu!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

github-actions[bot] avatar Sep 23 '22 12:09 github-actions[bot]