
Awesome Dynamic Networks and Conditional Computation



Upcoming ICML 2022 workshop on Dynamic Neural Networks on Friday, July 22: https://dynn-icml2022.github.io/



Overview of conditional computation and dynamic CNNs for computer vision, focusing on reducing the computational cost of existing network architectures. In contrast to static networks, dynamic networks disable parts of the network at inference time, based on the input image. This can save computation and speed up inference, for example by processing easy images with fewer operations. Note that this list mainly focuses on methods that reduce the computational cost of existing models (e.g. ResNet models), and does not cover all methods that use dynamic computation in custom architectures.

This list is growing every day. If a method is missing or listed incorrectly, let me know by opening a GitHub issue or pull request!

Here is a list with more static and dynamic methods for efficient CNNs.

Background

Methods have three important distinguishing factors:

  • The method's architecture, e.g. skipping layers or pixels, and whether these run-or-skip decisions are made by a separate policy network, a submodule in the network, or another mechanism.
  • The way of training the policy, e.g. using reinforcement learning, a gradient estimator such as Gumbel-Softmax, or a custom approach.
  • The implementation of the method, and whether it can be executed efficiently on existing platforms (i.e. whether the method actually speeds up inference, or only reduces the theoretical number of computations).
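
To make the second point concrete, here is a minimal sketch (my own illustration, not any specific paper's method) of how a binary run-or-skip policy can be trained with the Gumbel-Softmax straight-through estimator in PyTorch; the 50% sparsity target is an arbitrary example:

```python
import torch
import torch.nn.functional as F

# Hypothetical policy head output: per-sample logits for [skip, execute]
logits = torch.randn(8, 2, requires_grad=True)

# Training: hard=True forwards a discrete one-hot sample but
# backpropagates through the soft Gumbel-Softmax relaxation
# (the straight-through estimator).
decision = F.gumbel_softmax(logits, tau=1.0, hard=True)[:, 1]  # 1 = execute

# Inference: deterministic hard decision.
decision_eval = logits.argmax(dim=1).float()

# A sparsity loss steers the expected execution rate toward a target,
# here 50% of units executed on average.
p_execute = logits.softmax(dim=1)[:, 1]
sparsity_loss = (p_execute.mean() - 0.5) ** 2
```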

Metrics: Most methods report the reduction in computation (measured in floating-point operations, FLOPS) against the loss in accuracy. Papers typically show figures where baseline models of different complexities (e.g. obtained by reducing the number of channels) are compared to the method applied to the largest model at different cost savings.

Note that many works express computational complexity in FLOPS, even though the given numbers are actually multiply-accumulate operations (MACs), and GMACs = 0.5 * GFLOPs (see https://github.com/sovrasov/flops-counter.pytorch/issues/16 ). Some recent works therefore use GMAC instead of GFLOP to avoid ambiguity.
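
As a back-of-the-envelope reference (my own sketch, not from any listed paper), the MAC count of a standard convolution can be computed as follows:

```python
def conv2d_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulate operations of a k x k convolution.

    Each output element needs c_in/groups * k * k multiply-adds;
    FLOPs are commonly reported as 2 * MACs (one multiply + one add).
    """
    return c_out * h_out * w_out * (c_in // groups) * k * k

# e.g. the first 7x7 conv of a ResNet-50 on a 224x224 image:
macs = conv2d_macs(c_in=3, c_out=64, k=7, h_out=112, w_out=112)
print(f"{macs / 1e9:.3f} GMAC, ~{2 * macs / 1e9:.3f} GFLOPs")
```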

Tags used below (note: the tags are incomplete):

  • VID: Video processing

Surveys / overviews

  • Dynamic Neural Networks: A Survey (Arxiv 2021) [pdf] Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, Yulin Wang

Methods

Depth-based methods

Early-exit methods attach intermediate output branches to the network, so that easy inputs can exit after fewer layers.
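
A minimal sketch of the general pattern (a generic confidence-thresholded cascade; not a reimplementation of any paper below, and all module names are placeholders):

```python
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Backbone stages interleaved with exit classifiers; at inference,
    computation stops at the first exit whose softmax confidence
    exceeds a threshold (shown for batch size 1)."""
    def __init__(self, stages, exits, threshold=0.9):
        super().__init__()
        self.stages = nn.ModuleList(stages)  # feature extractors
        self.exits = nn.ModuleList(exits)    # one classifier per stage
        self.threshold = threshold

    def forward(self, x):
        if self.training:
            # train all exits jointly, e.g. with a summed cross-entropy
            outs = []
            for stage, exit_head in zip(self.stages, self.exits):
                x = stage(x)
                outs.append(exit_head(x))
            return outs
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)
            logits = exit_head(x)
            if logits.softmax(dim=1).amax() >= self.threshold:
                return logits  # confident enough: exit early
        return logits          # the last exit is always taken
```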

  • BranchyNet: Fast inference via early exiting from deep neural networks (ICPR2016) [pdf] [chainer]
    Teerapittayanon S, McDanel B, Kung HT
  • Conditional Deep Learning for Energy-Efficient and Enhanced Pattern Recognition (DATE2016) [pdf]
    P. Panda, A. Sengupta, and K. Roy
  • Adaptive Neural Networks for Efficient Inference (ICML2017) [pdf] [GitHub no code]
    T. Bolukbasi, J. Wang, O. Dekel, and V. Saligrama
  • Dynamic computational time for visual attention (ICCV2017 workshop) [pdf] [torch lua]
    Li, Z., Yang, Y., Liu, X., Zhou, F., Wen, S. and Xu, W.
  • DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks (SiPS2019) [pdf]
    M. Wang, J. Mo, J. Lin, Z. Wang, and L. Du
  • Improved Techniques for Training Adaptive Deep Networks (ICCV2019) [pdf] [Pytorch]
    H. Li, H. Zhang, X. Qi, Y. Ruigang, and G. Huang
  • Early-exit convolutional neural networks (thesis 2019) [pdf]
    E. Demir
  • Efficient adaptive inference for deep convolutional neural networks using hierarchical early exits (Pattern Recognition 2020) [pdf]
    N. Passalis, J. Raitoharju, A. Tefas, and M. Gabbouj
  • Triple wins: Boosting accuracy, robustness and efficiency together by enabling input-adaptive inference (ICLR2020) [pdf] [pytorch]
    Hu TK, Chen T, Wang H, Wang Z.
  • FrameExit: Conditional Early Exiting for Efficient Video Recognition (CVPR2021) [pdf] Ghodrati, A., Bejnordi, B. E., & Habibian, A.
    [VID]

Layer-skipping methods skip individual layers conditioned on the input image; for instance, easy images require fewer layers than complex ones:
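
The pattern, as a minimal sketch (in the spirit of gated residual blocks like SkipNet and ConvNet-AIG, but not a reimplementation of either; the names are mine):

```python
import torch.nn as nn
import torch.nn.functional as F

class SkippableResBlock(nn.Module):
    """Residual block whose body runs only when an input-dependent gate
    decides to execute it; the identity path keeps the output
    well-defined for any gating pattern."""
    def __init__(self, block, channels, tau=1.0):
        super().__init__()
        self.block = block                    # e.g. a ResNet bottleneck
        self.policy = nn.Linear(channels, 2)  # logits for [skip, execute]
        self.tau = tau

    def forward(self, x):
        logits = self.policy(x.mean(dim=(2, 3)))  # pooled features -> gate
        if self.training:
            g = F.gumbel_softmax(logits, tau=self.tau, hard=True)[:, 1]
        else:
            g = logits.argmax(dim=1).float()      # index 1 means execute
        g = g.view(-1, 1, 1, 1)
        # FLOPs are saved only if skipped samples are truly not computed,
        # e.g. with batch size 1 or by regrouping samples per decision.
        return x + g * self.block(x)
```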

  • Adaptive Computation Time for Recurrent Neural Networks (NIPS 2016 Deep Learning Symposium) [pdf] [unofficial pytorch]
    A. Graves
  • Convolutional Networks with Adaptive Inference Graphs (ECCV2018) [pdf] [Pytorch]
    A. Veit and S. Belongie
  • SkipNet: Learning Dynamic Routing in Convolutional Networks (ECCV2018) [pdf] [Pytorch]
    X. Wang, F. Yu, Z.-Y. Dou, T. Darrell, and J. E. Gonzalez
  • BlockDrop: Dynamic Inference Paths in Residual Networks (CVPR2018) [pdf] [Pytorch]
    Zuxuan Wu*, Tushar Nagarajan*, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, and Rogerio Feris
  • Dynamic Multi-path Neural Network (Arxiv2019) [pdf]
    Su, Y., Zhou, S., Wu, Y., Su, T., Liang, D., Liu, J., Zheng, D., Wang, Y., Yan, J. and Hu, X.
  • EnergyNet: Energy-efficient dynamic inference (2018) [pdf]
    Wang, Yue, et al.
  • Dual dynamic inference: Enabling more efficient, adaptive and controllable deep inference (IEEE Journal of Selected Topics in Signal Processing 2020) [pdf]
    Wang Y, Shen J, Hu TK, Xu P, Nguyen T, Baraniuk RG, Wang Z, Lin Y.
  • CoDiNet: Path Distribution Modeling with Consistency and Diversity for Dynamic Routing (TPAMI 2021) [pdf]

Recursive methods execute some layers multiple times ('recursively'), depending on input complexity:
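
A minimal sketch of this idea (a simplified, ACT-inspired halting rule; the methods below differ in how the halting decision is trained):

```python
import torch.nn as nn

class RecursiveBlock(nn.Module):
    """Applies one weight-shared block up to max_steps times; a small
    halting head decides after each step whether to stop early
    (simplified inference-time rule, applied per batch)."""
    def __init__(self, block, channels, max_steps=4, threshold=0.5):
        super().__init__()
        self.block = block               # reused at every step
        self.halt = nn.Linear(channels, 1)
        self.max_steps = max_steps
        self.threshold = threshold

    def forward(self, x):
        for _ in range(self.max_steps):
            x = self.block(x)
            p_halt = self.halt(x.mean(dim=(2, 3))).sigmoid()
            if bool((p_halt > self.threshold).all()):
                break                    # every sample wants to stop
        return x
```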

  • IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification (ICLR2018 Workshop) [pdf]
    S. Leroux, P. Molchanov, P. Simoens, B. Dhoedt, T. Breuel, and J. Kautz
  • Dynamic recursive neural network (CVPR2019) [pdf]
    Guo, Q., Yu, Z., Wu, Y., Liang, D., Qin, H., and Yan, J.

Channel-based methods

Channel-based methods execute only a subset of the channels to reduce computational complexity.
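
A minimal sketch of per-channel gating (my own illustration; the methods below differ in the policy architecture and the relaxation used for training):

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Predicts a per-channel on/off mask from globally pooled features
    and multiplies it into the activation; when entire channels are
    gated off, the filters producing them can be skipped."""
    def __init__(self, channels, tau=1.0):
        super().__init__()
        self.policy = nn.Linear(channels, channels)
        self.tau = tau

    def forward(self, x):
        scores = self.policy(x.mean(dim=(2, 3)))  # (B, C) channel scores
        if self.training:
            # soft relaxation; per-channel Gumbel-Softmax or
            # straight-through estimators are common alternatives
            mask = torch.sigmoid(scores / self.tau)
        else:
            mask = (scores > 0).float()           # hard on/off decision
        return x * mask.view(x.size(0), -1, 1, 1)
```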

  • Estimating or propagating gradients through stochastic neurons for conditional computation (Arxiv2013) [pdf]
    Bengio Y, Léonard N, Courville A.

  • Runtime Neural Pruning (NIPS2017) [pdf]
    J. Lin, Y. Rao, J. Lu, and J. Zhou

  • Dynamic Channel Pruning: Feature Boosting and Suppression (ICLR2019) [pdf] [tensorflow] [unofficial pytorch]
    X. Gao, Y. Zhao, Ł. Dudziak, R. Mullins, and C. Xu.

  • Channel Gating Neural Networks (NIPS2019) [pdf] [pytorch]
    W. Hua, Y. Zhou, C. M. De Sa, Z. Zhang, and G. E. Suh

  • You Look Twice: GaterNet for Dynamic Filter Selection in CNNs (CVPR2019) [pdf]
    Z. Chen, Y. Li, S. Bengio, and S. Si

  • Runtime Network Routing for Efficient Image Classification (TPAMI2019) [pdf]
    Y. Rao, J. Lu, J. Lin, and J. Zhou

  • Dynamic Neural Network Channel Execution for Efficient Training (BMVC2019) [pdf]
    S. E. Spasov and P. Lio

  • Learning Instance-wise Sparsity for Accelerating Deep Models (IJCAI2019) [pdf]
    Liu C, Wang Y, Han K, Xu C, Xu C.

  • Batch-Shaping for Learning Conditional Channel Gated Networks (ICLR2020) [pdf]
    BE Bejnordi, T Blankevoort, M Welling

  • Dynamic slimmable network (CVPR2021) [pdf] [pytorch]
    Li, Changlin, et al.

  • Dynamic Slimmable Denoising Network (2021) [pdf] Jiang, Zutao, Changlin Li, Xiaojun Chang, Jihua Zhu, and Yi Yang

  • DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers (2021) [pdf] Li, C., Wang, G., Wang, B., Liang, X., Li, Z., & Chang, X.

  • Borrowing from yourself: Faster future video segmentation with partial channel update (2022) [pdf]

  • Multi-dimensional dynamic model compression for efficient image super-resolution (WACV2022) [pdf]

Spatial methods

Spatial methods exploit spatial redundancies, such as unimportant image regions, to save computation.
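
A minimal sketch of pixel-level conditional execution (a dense reference implementation; as noted in the Background section, it only reduces the theoretical FLOPs, while efficient implementations gather the active positions):

```python
import torch.nn as nn

class SpatialGatedConv(nn.Module):
    """Predicts a binary spatial mask with a cheap 1x1 policy and zeroes
    the convolution output elsewhere. This dense version computes
    everything and masks afterwards; efficient versions only evaluate
    the convolution at (blocks of) active pixels."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)
        self.mask_head = nn.Conv2d(c_in, 1, 1)  # cheap per-pixel policy

    def forward(self, x):
        # hard threshold shown for inference; training would use a
        # differentiable relaxation such as Gumbel-Softmax
        mask = (self.mask_head(x) > 0).float()   # (B, 1, H, W)
        return self.conv(x) * mask
```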

Spatial per-pixel

  • PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions (NIPS2016) [pdf] [matconvnet] [caffe]
    M. Figurnov, A. Ibraimova, D. P. Vetrov, and P. Kohli

  • Spatially Adaptive Computation Time for Residual Networks (CVPR2017) [pdf] [tensorflow]
    Figurnov M, Collins MD, Zhu Y, Zhang L, Huang J, Vetrov D, Salakhutdinov R.

  • Pixel-wise Attentional Gating for Parsimonious Pixel Labeling (WACV2019) [pdf] [matconvnet]
    S. Kong and C. Fowlkes

  • Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating (MICRO2019) [pdf] Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, and G. Edward Suh

  • Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference (CVPR2020) [pdf] [Pytorch]
T. Verelst and T. Tuytelaars

  • Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation (ECCV2020) [pdf] [pytorch]
    Z. Xie, Z. Zhang, X. Zhu, G. Huang, and S. Lin

  • Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activation (ICLR2020) [pdf] [pytorch]
    Zhang Y, Zhao R, Hua W, Xu N, Suh GE, Zhang Z.

  • Dynamic Dual Gating Neural Networks (ICCV2021) [pdf]

  • Skip-Convolutions for Efficient Video Processing (CVPR2021) [pdf] [pytorch]
    [VID]

  • Focal Sparse Convolutional Networks for 3D Object Detection (CVPR2022) [pdf]

Spatial per-block

  • SBNet: Sparse Blocks Network for Fast Inference (CVPR2018) [pdf] [tensorflow]
    M. Ren, A. Pokrovsky, B. Yang, and R. Urtasun

  • Uncertainty based model selection for fast semantic segmentation (MVA2019) [pdf]

  • SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation (ECCV2020 Workshop) [pdf] Thomas Verelst and Tinne Tuytelaars

  • Spatially Adaptive Feature Refinement for Efficient Inference [pdf]
    Y Han, G Huang, S Song, L Yang, Y Zhang, H Jiang

  • BlockCopy: High-Resolution Video Processing with Block-Sparse Feature Propagation and Online Policies (ICCV 2021) [pdf] Thomas Verelst and Tinne Tuytelaars
    [VID]

Spatial warping

  • Learning to Zoom: A Saliency-Based Sampling Layer for Neural Networks (ECCV2018) [pdf] [pytorch] Adria Recasens, Petr Kellnhofer, Simon Stent, Wojciech Matusik, Antonio Torralba

Glances and dynamic crops

These methods take one or more crops of the input image to further refine predictions:
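
A minimal glance-then-focus sketch (batch size 1; `glance_net` and `focus_net` are hypothetical placeholders, and the methods below train the crop location rather than using a fixed saliency argmax):

```python
import torch.nn.functional as F

def glance_and_focus(image, glance_net, focus_net, crop=96, conf=0.9):
    """Cheap 'glance' on a downscaled image; if the prediction is not
    confident enough, refine with a full-resolution crop around the
    most salient location found by the glance network."""
    small = F.interpolate(image, scale_factor=0.25, mode='bilinear',
                          align_corners=False)
    logits, saliency = glance_net(small)       # saliency: (1, 1, h, w)
    if logits.softmax(dim=1).amax() >= conf:
        return logits                          # confident: stop early
    # map the saliency peak back to full-resolution coordinates
    _, _, h, w = saliency.shape
    idx = int(saliency.flatten().argmax())
    cy = (idx // w) * image.shape[2] // h
    cx = (idx % w) * image.shape[3] // w
    y0 = max(0, min(cy - crop // 2, image.shape[2] - crop))
    x0 = max(0, min(cx - crop // 2, image.shape[3] - crop))
    return focus_net(image[:, :, y0:y0 + crop, x0:x0 + crop])
```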

  • Action Recognition using Visual Attention (ICLR 2016 Workshop) [pdf] [theano]
    S. Sharma, R. Kiros, and R. Salakhutdinov

  • Recurrent Models of Visual Attention (NIPS2014) [pdf]
    V. Mnih, N. Heess, A. Graves, and koray kavukcuoglu

  • Dynamic Capacity Networks (ICML2016) [pdf] [tensorflow] [unofficial pytorch]
    A. Almahairi, N. Ballas, T. Cooijmans, Y. Zheng, H. Larochelle, and A. Courville

  • Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification (NIPS2020) [pdf]
    Y. Wang, K. Lv, R. Huang, S. Song, L. Yang, and G. Huang

  • Learning Where to Focus for Efficient Video Object Detection (ECCV2020) [pdf] [github]
    Z. Jiang et al.

  • Adaptive Focus for Efficient Video Recognition (2021) [pdf]
    Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang
    [VID]

  • Adafocus v2: End-to-end training of spatial dynamic networks for video recognition (2021) [pdf] [VID]

Other (dilation, etc.)

  • D^2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos [pdf] Christian Schmidt, Ali Athar, Sabarinath Mahadevan, Bastian Leibe
    [VID]

Adaptive resolution methods

Adaptive resolution methods adapt the processing resolution to the input image.
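
A minimal sketch of the idea (`policy_net` and `classifier` are hypothetical placeholders; the listed methods train the resolution choice end to end rather than with a fixed rule):

```python
import torch.nn.functional as F

def adaptive_resolution_forward(image, policy_net, classifier,
                                scales=(0.5, 0.75, 1.0)):
    """A lightweight policy inspects a thumbnail and picks the resolution
    at which the main classifier runs; a scale s costs roughly s^2 of
    the full-resolution FLOPs (batch size 1, simplified)."""
    thumb = F.interpolate(image, scale_factor=0.25, mode='bilinear',
                          align_corners=False)
    s = scales[int(policy_net(thumb).argmax())]  # pick one scale
    if s != 1.0:
        image = F.interpolate(image, scale_factor=s, mode='bilinear',
                              align_corners=False)
    return classifier(image)
```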

  • Resolution Adaptive Networks for Efficient Inference (CVPR2020) [pdf] [pytorch]
    L. Yang, Y. Han, X. Chen, S. Song, J. Dai, and G. Huang
  • Resolution Switchable Networks for Runtime Efficient Image Recognition (ECCV2020) [pdf] [pytorch]
    Y. Wang, F. Sun, D. Li, and A. Yao
  • Dynamic Resolution Network (2021) [pdf]
  • Multi-dimensional dynamic model compression for efficient image super-resolution (WACV2022) [pdf]

Transformers

  • Dynamically Pruning Segformer for Efficient Semantic Segmentation (Arxiv2021) [pdf]
    Haoli Bai, Hongda Mao, Dinesh Nair

  • Spatio-Temporal Gated Transformers for Efficient Video Processing (2021) [pdf]
    Yawei Li, Babak Ehteshami Bejnordi, Bert Moons, Tijmen Blankevoort, Amirhossein Habibian, Radu Timofte, Luc Van Gool
    [VID]

  • Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition (NIPS2021) [pdf]
    Yulin Wang, Rui Huang, Shiji Song, Zeyi Huang, Gao Huang

  • Multi-Exit Vision Transformer for Dynamic Inference (2021) [pdf]
    A Bakhtiarnia, Q Zhang, A Iosifidis

  • Dynamic Grained Encoder for Vision Transformers (NIPS2021) [pdf]
Song, Lin, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, and Nanning Zheng

  • A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR2022) [pdf]

Dynamic filters/weights

  • Dynamic filter networks (NIPS2016) [pdf]
    Jia, X., De Brabandere, B., Tuytelaars, T., & Gool, L. V.

  • Dynamic region-aware convolution (CVPR2021) [pdf]
    Chen, J., Wang, X., Guo, Z., Zhang, X., & Sun, J.

  • Decoupled Dynamic Filter Networks (CVPR2021) [pdf]

  • Involution: Inverting the inherence of convolution for visual recognition (CVPR2021) [pdf] [pytorch] [unofficial tf]

  • Adaptive Convolutions with Per-pixel Dynamic Filter Atom (ICCV2021) [pdf]

Quantization

  • Instance-Aware Dynamic Neural Network Quantization (CVPR2022) [pdf](https://openaccess.thecvf.com/content/CVPR2022/html/Liu_Instance-Aware_Dynamic_Neural_Network_Quantization_CVPR_2022_paper.html)

Mixture of experts

  • HydraNets: Specialized Dynamic Architectures for Efficient Inference (CVPR2019) [pdf]
    Teja Mullapudi R, Mark WR, Shazeer N, Fatahalian K.
  • Outrageously large neural networks: The sparsely-gated mixture-of-experts layer (ICLR 2017) [pdf] [unofficial pytorch]
    Shazeer N, Mirhoseini A, Maziarz K, Davis A, Le Q, Hinton G, Dean J.

Other

Video

(if not listed above with [VID] tag)

  • Leaky Gated Cross-Attention for Weakly Supervised Multi-Modal Temporal Action Localization (WACV2022) [pdf] [VID]

  • ELIχR: Eliminating Computation Redundancy in CNN-Based Video Processing (RSDHA2021) [IEEE] [VID]