blueoil
[WIP] a new network GlazedYolo
This network is experimental; so far it cannot run on FPGAs.
Description
This network brings several recent ideas to our YOLOv2 implementation. In short, GlazedYolo = YoloV2 + BlazeFace + MixConv + Group Convolution.
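Of these ingredients, MixConv is the least self-explanatory: it splits a layer's channels into groups and convolves each group with a different kernel size. A minimal sketch of the channel split (the helper name and numbers are illustrative, not from the blueoil code):

```python
def mixconv_channel_split(num_channels, kernel_sizes):
    """Split `num_channels` as evenly as possible across kernel-size groups,
    the way MixConv assigns channels to its parallel kernels."""
    n_groups = len(kernel_sizes)
    base, rem = divmod(num_channels, n_groups)
    # Earlier groups absorb the remainder so every channel is covered.
    sizes = [base + (1 if i < rem else 0) for i in range(n_groups)]
    return list(zip(kernel_sizes, sizes))

# 32 channels over 3x3 and 5x5 kernels -> 16 channels each
print(mixconv_channel_split(32, [3, 5]))  # [(3, 16), (5, 16)]
```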
Architecture difference from LMFYolo
- mainly uses 5x5 convolutions instead of 3x3 convolutions
- uses stride=2 convolutions in some places
- uses residual connections
- makes heavy use of group convolution
- downsampling rate is 16 (LMFYolo's downsampling rate is 32)
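The cost trade-off behind the first and fourth points can be checked with a back-of-envelope multiply-accumulate count; the helper below and its layer sizes are my own illustration, not taken from the code:

```python
def conv_macs(h, w, c_in, c_out, k, groups=1, stride=1):
    """Multiply-accumulates of a k x k convolution on an h x w x c_in input.
    Each of the (h/stride)*(w/stride)*c_out outputs sums over
    k*k*(c_in/groups) input values."""
    return (h // stride) * (w // stride) * c_out * k * k * (c_in // groups)

# A 5x5 convolution with 4 groups costs less than a dense 3x3 convolution
# on the same hypothetical 40x40x64 -> 64 layer: 25/4 < 9 taps per output.
dense_3x3 = conv_macs(40, 40, 64, 64, k=3)
grouped_5x5 = conv_macs(40, 40, 64, 64, k=5, groups=4)
print(dense_3x3, grouped_5x5)  # 58982400 40960000
```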
I guess the last difference contributes the most to accuracy. GlazedYolo achieves the following performance numbers (mAP@IoU=0.5).
| | WIDER_FACE (160x160) | PASCALVOC (320x320) | GOPs@160x160 |
|---|---|---|---|
| LMFYolo (quantized) | 0.559 | 0.446 | 0.582 |
| GlazedYolo (quantized) | 0.727 | 0.472 | 0.697 |
On PASCALVOC the difference is small, but there is a large gap on WIDER_FACE. When the input image size is enlarged to 320x320, GlazedYolo (quantized) achieves 81.9% mAP on the WIDER_FACE dataset.
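The WIDER_FACE gain is consistent with the finer output grid: halving the downsampling rate quadruples the number of detector cells, which matters most for small objects such as faces. A quick sanity check of the grid sizes (pure arithmetic, assuming square inputs):

```python
def output_grid(input_size, downsample):
    """Side length of the detector's output grid for a square input."""
    return input_size // downsample

# LMFYolo (downsample 32) vs GlazedYolo (downsample 16) at 160x160 input
print(output_grid(160, 32), output_grid(160, 16))  # 5 10 -> 4x more cells
```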
Further direction
To make it easier to run on our accelerator, I'm planning the following experiments:
- replace stride=2 conv with max pooling or space_to_depth
- remove 5x5 conv
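space_to_depth is a drop-in for a stride-2 convolution's downsampling because it halves spatial resolution without discarding any pixel information. A minimal pure-Python version for a single-channel image (block size 2, list-of-lists layout; an illustrative sketch, not blueoil's implementation):

```python
def space_to_depth(img, block=2):
    """Rearrange an H x W single-channel image (list of lists) into an
    (H/block) x (W/block) grid where each cell stacks its block x block
    pixels into a channel list -- no information is discarded."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, block):
        row = []
        for x in range(0, w, block):
            # Flatten the block x block patch into the channel dimension.
            row.append([img[y + dy][x + dx]
                        for dy in range(block) for dx in range(block)])
        out.append(row)
    return out

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
print(space_to_depth(img))
# [[[1, 2, 5, 6], [3, 4, 7, 8]], [[9, 10, 13, 14], [11, 12, 15, 16]]]
```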
Motivation and Context
We want a better network for object detection without drastically changing the computation cost.
How has this been tested?
Accuracy was checked by running several experiments.
Screenshots (if appropriate):
None
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature / Optimization (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Checklist:
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
> To make it easier to run it on our accelerator, I'm planning following experiments
> - replace stride=2 conv with max pooling or space_to_depth
> - remove 5x5 conv
If we are talking about the new one, stride=2 & 5x5 sound OK to me, as long as the amount of compute doesn't change... :thinking:
To support stride=2, 5x5conv on cpu:
- very easy: 5x5conv
- easy: stride=2 for AArch32
- hard (or no optimization, just ignore unused results): stride=2 for other architectures
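The "ignore unused results" route amounts to running the convolution densely (stride 1) and then keeping every other output: wasteful in compute, but it needs no new kernel. A 1-D illustration in pure Python (my sketch, not the accelerator code):

```python
def conv1d_dense(signal, kernel):
    """Valid 1-D convolution (cross-correlation) at stride 1."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def conv1d_stride2_via_subsample(signal, kernel):
    """Emulate stride=2 by computing all outputs and discarding the odd ones."""
    return conv1d_dense(signal, kernel)[::2]

sig, ker = [1, 2, 3, 4, 5, 6], [1, 0, -1]
print(conv1d_dense(sig, ker))                  # [-2, -2, -2, -2]
print(conv1d_stride2_via_subsample(sig, ker))  # [-2, -2]
```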