Computer-Vision
Toy implementations of CNNs
Tutorials of Computer Vision
This repo includes implementations of computer vision algorithms using TensorFlow 2+. The code is easy to read and follow. If you can read Chinese, I have a teaching website for studying AI models.
All toy implementations are organised as follows:
- CNN
  - Numpy Convolution mechanism
  - LeNet
  - VGG
  - GoogLeNet
  - ResNet
  - DenseNet
  - SENet
  - MobileNetV1
  - MobileNetV2
  - Xception
  - ShuffleNetV1
  - ShuffleNetV2
Installation
$ git clone https://github.com/MorvanZhou/Computer-Vision
$ cd Computer-Vision
$ pip install -r requirements.txt
ConvMechanism
Convolution mechanism and feature map
code - gif result
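As a rough illustration of the mechanism (not the repo's own code), the sketch below slides a kernel over a single-channel image with stride 1 and no padding to produce a feature map. The function name `conv2d_valid` and the toy 4x4 input are assumptions made for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Like most deep learning libraries, this is cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]          # window under the kernel
            feature_map[i, j] = np.sum(patch * kernel)  # weighted sum of the window
    return feature_map

# Toy input: pixel value is 4 * row + col.
image = np.arange(16, dtype=np.float64).reshape(4, 4)
# This kernel subtracts the lower-right neighbour from the upper-left pixel.
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # every entry is -5.0
```

A 4x4 input and a 2x2 kernel give a 3x3 feature map, since the kernel fits in 4 - 2 + 1 positions along each axis.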
LeNet
Gradient-Based Learning Applied to Document Recognition
code - net structure
VGG
Very Deep Convolutional Networks for Large-Scale Image Recognition
A deep CNN built by stacking many convolution layers.
code - net structure
GoogLeNet
Going Deeper with Convolutions
Uses multiple kernel sizes in parallel to capture local information at different scales.
code - net structure
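The multi-kernel idea can be sketched in plain NumPy as below. This is a simplified stand-in, not the repo's implementation: it runs one random 1x1, 3x3 and 5x5 kernel over the same single-channel input with zero ("same") padding and stacks the results as channels. The pooling branch and the 1x1 reduction layers of the real block are left out, and `conv2d_same` is a hypothetical helper.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Apply an odd-sized square kernel with zero padding so the output keeps the input size."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(image, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))

# One branch per kernel size; each branch looks at a different neighbourhood size.
branches = [conv2d_same(image, rng.normal(size=(k, k))) for k in (1, 3, 5)]

# All branches share the same height and width, so they can be joined on the channel axis.
block_output = np.stack(branches, axis=-1)
print(block_output.shape)  # (8, 8, 3)
```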
ResNet
Deep Residual Learning for Image Recognition
Adds residual connections for better gradient flow.
code - net structure
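A minimal sketch of a residual connection follows, assuming two dense layers stand in for the conv layers of a real block. The name `residual_block` and the shapes are made up for illustration. Because the output is `x + F(x)`, the gradient with respect to `x` always contains an identity term, so the signal still passes through when `F` contributes little.

```python
import numpy as np

def residual_block(x, w1, w2):
    """Return x + F(x), where F is two dense layers with a ReLU in between."""
    hidden = np.maximum(x @ w1, 0.0)  # first layer + ReLU
    fx = hidden @ w2                  # second layer, no activation
    return x + fx                     # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = np.zeros((8, 8))  # zero weights make F(x) = 0

y = residual_block(x, w1, w2)
print(np.allclose(y, x))  # True: with F(x) = 0 the block is an identity mapping
```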
DenseNet
Densely Connected Convolutional Networks
Compared with ResNet, each conv layer has fewer filters but sees the outputs of all previous layers in its block.
code - net structure
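The "sees more previous inputs" part can be made concrete with some channel bookkeeping. In the sketch below, each layer adds a small, fixed number of feature maps (the growth rate) and receives the concatenation of everything before it. The function name and the numbers (64 input channels, growth rate 32, 4 layers) are assumptions chosen for the example.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return the input width seen by each layer and the block's output width."""
    channels = in_channels
    seen_by_layer = []
    for _ in range(num_layers):
        seen_by_layer.append(channels)  # layer input = block input + all earlier outputs
        channels += growth_rate         # each layer only adds `growth_rate` new maps
    return seen_by_layer, channels

seen_by_layer, out_channels = dense_block_channels(in_channels=64, growth_rate=32, num_layers=4)
print(seen_by_layer)  # [64, 96, 128, 160]
print(out_channels)   # 192
```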
SENet
Squeeze-and-Excitation Networks
SE is a module that learns to scale each feature map. It can be plugged into many CNN blocks. A larger reduction_ratio reduces the parameter count in the FC layers with a limited accuracy drop.
code - net structure
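Here is a rough NumPy sketch of the squeeze and excitation steps, plus a count of the FC weights for two reduction ratios. It is only an illustration under stated assumptions: biases are omitted, the weights are random, and `se_scale` and `se_fc_params` are hypothetical names, not functions from this repo.

```python
import numpy as np

def se_scale(feature_maps, w_reduce, w_expand):
    """Scale each channel of an (H, W, C) tensor by a learned gate in (0, 1)."""
    squeezed = feature_maps.mean(axis=(0, 1))           # squeeze: global average pool -> (C,)
    hidden = np.maximum(squeezed @ w_reduce, 0.0)       # FC + ReLU -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w_expand)))  # FC + sigmoid -> (C,)
    return feature_maps * gates                         # excitation: per-channel scaling

def se_fc_params(channels, reduction_ratio):
    """Weight count of the two FC layers (biases omitted)."""
    hidden = channels // reduction_ratio
    return channels * hidden + hidden * channels

rng = np.random.default_rng(0)
channels, reduction_ratio = 64, 4
feature_maps = rng.normal(size=(8, 8, channels))
w_reduce = rng.normal(size=(channels, channels // reduction_ratio))
w_expand = rng.normal(size=(channels // reduction_ratio, channels))

scaled = se_scale(feature_maps, w_reduce, w_expand)
print(scaled.shape)          # (8, 8, 64)
print(se_fc_params(64, 4))   # 2048
print(se_fc_params(64, 16))  # 512
```

Going from a ratio of 4 to 16 cuts the FC weights by a factor of four in this toy setting.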
MobileNetV1
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Decomposes a classical conv into two operations: depthwise (dw) and pointwise (pw). A small but effective CNN optimized for mobile (CPU).
code - net structure
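To see why the dw+pw split is small, the weights of both variants can be counted directly. The sketch below ignores biases and batch norm, and the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions picked for the example.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Every output channel has its own k x k filter over every input channel."""
    return kernel_size * kernel_size * in_channels * out_channels

def separable_conv_params(kernel_size, in_channels, out_channels):
    """Depthwise: one k x k filter per input channel. Pointwise: a 1x1 conv that mixes channels."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise

standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(standard)   # 73728
print(separable)  # 8768
print(standard / separable)
```

In this setting the separable version needs roughly one eighth of the weights of the classical conv.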
MobileNetV2
MobileNetV2: Inverted Residuals and Linear Bottlenecks
MobileNetV2 is V1 with residual blocks and rearranged layers (residual+pw+dw+pw):
- MobileNetV1: dw > pw
- MobileNetV2: pw > dw > pw, which lets dw see more feature maps
code - net structure
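The two layer orders above can be compared by tracing how many channels each operation outputs. This is bookkeeping only, not a working layer: the function names are hypothetical, and the expansion factor of 6 and the 24-channel example are assumptions for illustration.

```python
def v1_block_channels(in_channels, out_channels):
    """MobileNetV1 order: dw works on the raw input width, then pw changes the width."""
    return [("dw", in_channels), ("pw", out_channels)]

def v2_block_channels(in_channels, out_channels, expansion=6):
    """MobileNetV2 order: pw expands first, so dw works on the wider tensor, then pw projects back."""
    expanded = in_channels * expansion
    return [("pw", expanded), ("dw", expanded), ("pw", out_channels)]

print(v1_block_channels(24, 24))  # [('dw', 24), ('pw', 24)]
print(v2_block_channels(24, 24))  # [('pw', 144), ('dw', 144), ('pw', 24)]
```

With these numbers the depthwise conv filters 144 feature maps in the V2 order, compared with 24 in the V1 order.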
Xception
Xception: Deep Learning with Depthwise Separable Convolutions
Much like MobileNetV2 without the last pw (residual+pw+dw).
code - net structure
ShuffleNetV1
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Shuffles the output of the 1x1 conv and uses group conv to reduce connections and speed up computation. However, MobileNet performs better in this case, possibly because group conv cuts off some communication between feature maps.
code - net structure
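The shuffle step itself is just a reshape, a transpose and another reshape. Below is a small NumPy sketch on a channels-last tensor. The helper name `channel_shuffle` and the 6-channel toy input are assumptions for this example, not the repo's code.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave the channels of an (H, W, C) tensor so each group receives channels from every other group."""
    h, w, c = x.shape
    assert c % groups == 0, "channel count must be divisible by the number of groups"
    x = x.reshape(h, w, groups, c // groups)  # split channels into groups
    x = x.transpose(0, 1, 3, 2)               # swap the group axis and the per-group axis
    return x.reshape(h, w, c)                 # flatten back to a single channel axis

# One pixel with channels 0..5, where group 0 = [0, 1, 2] and group 1 = [3, 4, 5].
x = np.arange(6).reshape(1, 1, 6)
shuffled = channel_shuffle(x, groups=2)
print(shuffled[0, 0])  # [0 3 1 4 2 5]
```

After the shuffle, a following group conv with 2 groups sees channels that came from both original groups.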
ShuffleNetV2
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Further reduces parameters by replacing group conv with split+concat, and performs the shuffle at the end of the block, which speeds up computation. However, MobileNet performs better in this case, possibly because group conv cuts off some communication between feature maps.
code - net structure
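A minimal sketch of the split+concat+shuffle data flow is shown below, assuming a channels-last tensor and a stride-1 unit. The conv branch is replaced by a caller-supplied function, here a simple multiply, and the names `shuffle_v2_unit` and `branch_fn` are hypothetical.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave the channels of an (H, W, C) tensor across `groups`."""
    h, w, c = x.shape
    x = x.reshape(h, w, groups, c // groups)
    return x.transpose(0, 1, 3, 2).reshape(h, w, c)

def shuffle_v2_unit(x, branch_fn):
    """Split channels in half, transform one half, concat, then shuffle at the end of the block."""
    left, right = np.split(x, 2, axis=-1)         # left half is passed through untouched
    right = branch_fn(right)                      # right half goes through the conv branch
    out = np.concatenate([left, right], axis=-1)  # concat replaces the residual add
    return channel_shuffle(out, groups=2)         # mix the two halves for the next unit

# One pixel with channels [0, 1, 2, 3]; the stand-in branch multiplies by 10.
x = np.arange(4).reshape(1, 1, 4)
out = shuffle_v2_unit(x, branch_fn=lambda t: t * 10)
print(out[0, 0])  # [ 0 20  1 30]
```

The untouched and transformed halves end up interleaved, so the next unit's split gives each half a mix of both.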