vision-toolbox
Add lightweight ViT
For learning purposes.
- [x] ViT: https://arxiv.org/abs/2010.11929
- [x] MLP-Mixer: https://arxiv.org/abs/2105.01601
- [x] DeiT: https://arxiv.org/abs/2012.12877
- [x] DeiT-III: https://arxiv.org/abs/2204.07118
- [x] CaiT: https://arxiv.org/abs/2103.17239
- [x] Swin: https://arxiv.org/abs/2103.14030
- [ ] Swin V2: https://arxiv.org/abs/2111.09883
- [ ] MobileViT: https://arxiv.org/abs/2110.02178
- [ ] MobileViTv2: https://arxiv.org/abs/2206.02680
- [ ] MobileViTv3: https://arxiv.org/abs/2209.15159
- [x] ConvNeXt: https://arxiv.org/abs/2201.03545
- [x] ConvNeXt-V2: https://arxiv.org/abs/2301.00808
- [ ] TinyViT: https://arxiv.org/abs/2207.10666
- [ ] MobileOne: https://arxiv.org/abs/2206.04040
- [ ] EfficientViT: https://arxiv.org/abs/2305.07027
- [ ] GC ViT: https://arxiv.org/abs/2206.09959
- [ ] Fast ViT: https://arxiv.org/abs/2303.14189
- [ ] Faster ViT: https://arxiv.org/abs/2306.06189
- [ ] MaxViT: https://arxiv.org/abs/2204.01697
Pretrained weights will probably be ported.