[InstanceNorm Optimize x86] AVX512/AVX/SSE intrinsic with elempack merged
- Add the avx512/avx/sse intrinsics for instancenorm
Codecov Report
Merging #4062 (b0e9531) into master (00c08d7) will decrease coverage by 1.46%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #4062 +/- ##
==========================================
- Coverage 94.43% 92.97% -1.47%
==========================================
Files 748 749 +1
Lines 179005 178735 -270
==========================================
- Hits 169047 166181 -2866
- Misses 9958 12554 +2596
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/layer/x86/instancenorm_x86.cpp | 100.00% <100.00%> (ø) | |
| src/layer/x86/convolution_2x2_pack8.h | 2.75% <0.00%> (-97.25%) | :arrow_down: |
| src/layer/x86/deconvolution_pack8.h | 10.76% <0.00%> (-89.24%) | :arrow_down: |
| src/layer/x86/convolution_sgemm_pack8.h | 14.24% <0.00%> (-85.24%) | :arrow_down: |
| src/layer/x86/convolution_sgemm_pack4to8.h | 29.16% <0.00%> (-70.84%) | :arrow_down: |
| src/layer/x86/convolution_pack8.h | 34.42% <0.00%> (-65.58%) | :arrow_down: |
| src/layer/x86/convolution_pack4to8.h | 42.85% <0.00%> (-55.11%) | :arrow_down: |
| ...c/layer/x86/convolution_winograd_transform_pack8.h | 54.90% <0.00%> (-45.10%) | :arrow_down: |
| src/layer/x86/convolution_3x3_pack1to8.h | 39.95% <0.00%> (-40.04%) | :arrow_down: |
| src/layer/x86/convolution_winograd_dot_pack8.h | 60.24% <0.00%> (-39.16%) | :arrow_down: |
| ... and 46 more | | |
Missing avx/avx512 optimization for pack4 and avx512 optimization for pack8?
If so, does the x86 part of batchnorm also need further optimization? @nihui
You could merge the multiple elempack code paths in batchnorm.
Ok, I will try to merge the elempack code paths into one.
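For reference, a minimal sketch of what a merged elempack code path could look like for the final `y = x * a + b` step that instancenorm and batchnorm share. This is not the code from this PR or from ncnn. `scale_bias_packed`, its signature and the tiled `a16`/`b16` coefficient buffers are hypothetical names chosen for illustration, and the data is assumed to be one packed channel stored as `size` groups of `elempack` floats. The idea is to walk the channel as a flat array of `size * elempack` floats and fall through from the widest available register to the scalar tail, so pack1/pack4/pack8/pack16 all share one loop nest:

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <immintrin.h>
#endif

// Hypothetical helper (not an ncnn API): applies y = x * a + b in place.
// ptr holds size groups of elempack floats; a and b hold elempack
// per-lane coefficients for this packed channel.
void scale_bias_packed(float* ptr, int size, int elempack, const float* a, const float* b)
{
    assert(elempack == 1 || elempack == 4 || elempack == 8 || elempack == 16);

    // Tile the coefficients to 16 lanes so every vector width can load
    // the lanes it needs at offset (i % 16), whatever elempack is.
    float a16[16];
    float b16[16];
    for (int k = 0; k < 16; k++)
    {
        a16[k] = a[k % elempack];
        b16[k] = b[k % elempack];
    }

    const int total = size * elempack;
    int i = 0;
#if defined(__AVX512F__)
    {
        const __m512 _a = _mm512_loadu_ps(a16);
        const __m512 _b = _mm512_loadu_ps(b16);
        for (; i + 15 < total; i += 16)
        {
            __m512 _p = _mm512_loadu_ps(ptr + i);
            _p = _mm512_add_ps(_mm512_mul_ps(_p, _a), _b);
            _mm512_storeu_ps(ptr + i, _p);
        }
    }
#endif
#if defined(__AVX__)
    for (; i + 7 < total; i += 8)
    {
        const __m256 _a = _mm256_loadu_ps(a16 + i % 16);
        const __m256 _b = _mm256_loadu_ps(b16 + i % 16);
        __m256 _p = _mm256_loadu_ps(ptr + i);
        _p = _mm256_add_ps(_mm256_mul_ps(_p, _a), _b);
        _mm256_storeu_ps(ptr + i, _p);
    }
#endif
#if defined(__SSE2__)
    for (; i + 3 < total; i += 4)
    {
        const __m128 _a = _mm_loadu_ps(a16 + i % 16);
        const __m128 _b = _mm_loadu_ps(b16 + i % 16);
        __m128 _p = _mm_loadu_ps(ptr + i);
        _p = _mm_add_ps(_mm_mul_ps(_p, _a), _b);
        _mm_storeu_ps(ptr + i, _p);
    }
#endif
    // Scalar tail, also the only path on builds without SSE2.
    for (; i < total; i++)
    {
        ptr[i] = ptr[i] * a16[i % 16] + b16[i % 16];
    }
}
```

With this shape, pack4 data automatically uses the 256-bit and 512-bit loops when they are compiled in, which is the gap raised in the question above. A real implementation would probably load the coefficient vectors once per channel instead of re-reading the tiled buffer on every iteration.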