cv-models
Models for Computer Vision
Computer Vision Models
Backbones
- [x] AlexNet - ImageNet Classification with Deep Convolutional Neural Networks, NeurIPS, 2012
- [x] VGGNets - Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014
- [x] GoogLeNet - Going Deeper with Convolutions, 2014
- [x] Inception-V3 - Rethinking the Inception Architecture for Computer Vision, 2015
- [x] Inception-V4 and Inception-ResNet - Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, AAAI, 2016
- [x] ResNet - Deep Residual Learning for Image Recognition, 2015
- [x] SqueezeNet - SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, 2016
- [x] ResNeXt - Aggregated Residual Transformations for Deep Neural Networks, CVPR, 2016
- [ ] Res2Net - Res2Net: A New Multi-scale Backbone Architecture, TPAMI, 2019
- [x] ReXNet - Rethinking Channel Dimensions for Efficient Model Design, CVPR, 2020
- [x] Xception - Xception: Deep Learning with Depthwise Separable Convolutions, CVPR, 2016
- [x] DenseNet - Densely Connected Convolutional Networks, CVPR, 2016
- [ ] DLA - Deep Layer Aggregation, CVPR, 2017
- [ ] DPN - Dual Path Networks, NeurIPS, 2017
- [ ] NASNet-A - Learning Transferable Architectures for Scalable Image Recognition, CVPR, 2017
- [ ] PNASNet - Progressive Neural Architecture Search, ECCV, 2017
- [x] MobileNets - MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017
- [x] MobileNetV2 - MobileNetV2: Inverted Residuals and Linear Bottlenecks, CVPR, 2018
- [x] MobileNetV3 - Searching for MobileNetV3, ICCV, 2019
- [x] ShuffleNet - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices, CVPR, 2017
- [x] ShuffleNetV2 - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design, ECCV, 2018
- [x] MnasNet - MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR, 2018
- [x] GhostNet - GhostNet: More Features from Cheap Operations, CVPR, 2019
- [ ] HRNet - Deep High-Resolution Representation Learning for Visual Recognition, TPAMI, 2019
- [ ] CSPNet - CSPNet: A New Backbone that can Enhance Learning Capability of CNN, CVPR, 2019
- [x] EfficientNet - EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML, 2019
- [x] EfficientNetV2 - EfficientNetV2: Smaller Models and Faster Training, ICML, 2021
- [x] RegNet - Designing Network Design Spaces, CVPR, 2020
- [ ] GPU-EfficientNets - Neural Architecture Design for GPU-Efficient Networks, 2020
- [ ] LambdaNetworks - LambdaNetworks: Modeling Long-Range Interactions Without Attention, ICLR, 2021
- [ ] RepVGG - RepVGG: Making VGG-style ConvNets Great Again, CVPR, 2021
- [ ] HardCoRe-NAS - HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search, ICML, 2021
- [ ] NFNet - High-Performance Large-Scale Image Recognition Without Normalization, ICML, 2021
- [ ] NF-ResNets - Characterizing signal propagation to close the performance gap in unnormalized ResNets, ICLR, 2021
- [x] ConvMixer - Patches Are All You Need?, 2021
- [x] VGNets - Efficient CNN Architecture Design Guided by Visualization, ICME, 2022
- [x] ConvNeXt - A ConvNet for the 2020s, CVPR, 2022
Attention Blocks
- [x] Non-Local - Non-local Neural Networks, CVPR, 2017
- [x] Squeeze-and-Excitation - Squeeze-and-Excitation Networks, CVPR, 2017
- [x] Gather-Excite - Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks, NeurIPS, 2018
- [x] CBAM - CBAM: Convolutional Block Attention Module, ECCV, 2018
- [x] SelectiveKernel - Selective Kernel Networks, CVPR, 2019
- [x] ECA - ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks, CVPR, 2019
- [x] GlobalContext - GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond, 2019
- [ ] ResNeSt - ResNeSt: Split-Attention Networks, 2020
- [ ] HaloNets - Scaling Local Self-Attention for Parameter Efficient Visual Backbones, 2021
Transformer
- [x] ViT - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR, 2020
- [ ] DeiT - Training data-efficient image transformers & distillation through attention, ICML, 2020
- [ ] Swin Transformer - Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, ICCV, 2021
- [ ] Twins - Twins: Revisiting the Design of Spatial Attention in Vision Transformers, NeurIPS, 2021
MLP
- [x] MLP-Mixer - MLP-Mixer: An all-MLP Architecture for Vision, NeurIPS, 2021
- [x] ResMLP - ResMLP: Feedforward networks for image classification with data-efficient training, 2021
- [ ] gMLP - Pay Attention to MLPs, 2021
Self-supervised
- [ ] MAE - Masked Autoencoders Are Scalable Vision Learners, CVPR, 2021
Object Detection
- [ ] R-CNN - Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR, 2013
- [ ] Fast R-CNN - Fast R-CNN, ICCV, 2015
- [ ] Faster R-CNN - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015
- [x] YOLOv1 - You Only Look Once: Unified, Real-Time Object Detection, 2015
- [ ] SSD - SSD: Single Shot MultiBox Detector, ECCV, 2015
- [ ] FPN - Feature Pyramid Networks for Object Detection, 2016
Semantic Segmentation
- [x] FCN - Fully Convolutional Networks for Semantic Segmentation, CVPR, 2014
- [x] UNet - U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI, 2015
- [ ] PSPNet - Pyramid Scene Parsing Network, CVPR, 2016
- [x] DeepLabv3 - Rethinking Atrous Convolution for Semantic Image Segmentation, 2017
- [x] DeepLabv3+ - Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, ECCV, 2018
- [ ] Mask R-CNN - Mask R-CNN, 2017
Generative Models
GANs
- [x] GAN - Generative Adversarial Networks, 2014
- [x] DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR, 2016
- [ ] WGAN - Wasserstein GAN, 2017
VAEs
- [x] VAE - Auto-Encoding Variational Bayes, 2013
- [x] CVAE - Learning Structured Output Representation using Deep Conditional Generative Models, NeurIPS, 2015
- [ ] β-VAE - beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, ICLR, 2017
Diffusion Models
Flow-based
Adversarial Attacks
- [x] FGSM - Explaining and Harnessing Adversarial Examples, ICLR, 2014
- [x] PGD - Towards Deep Learning Models Resistant to Adversarial Attacks, ICLR, 2017