opencv
GAPI Fluid: SIMD SSE41 for Resize F32C1.
SIMD SSE41 for the Resize F32C1.
Performance: ResizeF32C1_SIMD_SSE.xlsx
force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f
build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f
Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0
buildworker:Custom Win=windows-3
test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java
buildworker:Custom=linux-1
# disabled due to high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*
CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2
@anna-khakimova friendly reminder.
@TolyaTalamanov Is the patch still relevant? If so, could you drive it to merge by yourself?
@dmatveev Could you comment? I don't recall why we are doing this.
@dmatveev Friendly reminder.
@asmorkalov I'll have a look at this. At the very least, I'll bring the patch up to date and see what effect it has on F32 resize.
@rgarnov could you please have a look?
Checked performance of 4.x against this branch:
dm@irlvmvdmitryma:~/code/opencv_fluid/modules/ts/misc$ python3 ./summary.py ~/code/opencv_fluid_BUILD/4x.xml ~/code/opencv_fluid_BUILD/anna_simd.xml | grep 32FC1
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.5, 0.5, { gapi.kernel_package }) 0.012 0.012 0.97
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.5, 0.25, { gapi.kernel_package }) 0.007 0.008 0.95
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.5, 2, { gapi.kernel_package }) 0.037 0.038 0.96
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.25, 0.5, { gapi.kernel_package }) 0.008 0.008 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.25, 0.25, { gapi.kernel_package }) 0.006 0.006 0.97
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 0.25, 2, { gapi.kernel_package }) 0.023 0.023 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 2, 0.5, { gapi.kernel_package }) 0.033 0.033 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 2, 0.25, { gapi.kernel_package }) 0.018 0.018 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 128x128, 2, 2, { gapi.kernel_package }) 0.120 0.120 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.5, 0.5, { gapi.kernel_package }) 0.148 0.148 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.5, 0.25, { gapi.kernel_package }) 0.075 0.075 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.5, 2, { gapi.kernel_package }) 0.557 0.550 1.01
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.25, 0.5, { gapi.kernel_package }) 0.088 0.088 0.99
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.25, 0.25, { gapi.kernel_package }) 0.043 0.044 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 0.25, 2, { gapi.kernel_package }) 0.298 0.298 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 2, 0.5, { gapi.kernel_package }) 0.534 0.535 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 2, 0.25, { gapi.kernel_package }) 0.271 0.270 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 640x480, 2, 2, { gapi.kernel_package }) 2.119 2.117 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.5, 0.5, { gapi.kernel_package }) 0.416 0.423 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.5, 0.25, { gapi.kernel_package }) 0.212 0.216 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.5, 2, { gapi.kernel_package }) 1.639 1.625 1.01
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.25, 0.5, { gapi.kernel_package }) 0.233 0.236 0.99
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.25, 0.25, { gapi.kernel_package }) 0.118 0.122 0.97
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 0.25, 2, { gapi.kernel_package }) 0.844 0.842 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 2, 0.5, { gapi.kernel_package }) 1.599 1.594 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 2, 0.25, { gapi.kernel_package }) 0.794 0.797 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1280x720, 2, 2, { gapi.kernel_package }) 6.512 6.520 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.5, 0.5, { gapi.kernel_package }) 1.008 1.035 0.97
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.5, 0.25, { gapi.kernel_package }) 0.477 0.476 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.5, 2, { gapi.kernel_package }) 3.821 4.011 0.95
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.25, 0.5, { gapi.kernel_package }) 0.596 0.605 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.25, 0.25, { gapi.kernel_package }) 0.267 0.274 0.98
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 0.25, 2, { gapi.kernel_package }) 2.046 2.216 0.92
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 2, 0.5, { gapi.kernel_package }) 3.697 3.834 0.96
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 2, 0.25, { gapi.kernel_package }) 1.819 1.818 1.00
TestPerformance::ResizeFxFyPerfTestFluid/ResizeFxFyPerfTest::(compare_f, 32FC1, 1, 1920x1080, 2, 2, { gapi.kernel_package }) 14.901 14.888 1.00
TestPerformance::ResizeInSimpleGraphPerfTestFluid/ResizeInSimpleGraphPerfTest::(compare_f, 32FC1, 128x128, 0.5, 0.5, { gapi.kernel_package }) 0.026 0.024 1.07
TestPerformance::ResizeInSimpleGraphPerfTestFluid/ResizeInSimpleGraphPerfTest::(compare_f, 32FC1, 640x480, 0.5, 0.5, { gapi.kernel_package }) 0.278 0.272 1.02
TestPerformance::ResizeInSimpleGraphPerfTestFluid/ResizeInSimpleGraphPerfTest::(compare_f, 32FC1, 1280x720, 0.5, 0.5, { gapi.kernel_package }) 0.809 0.792 1.02
TestPerformance::ResizeInSimpleGraphPerfTestFluid/ResizeInSimpleGraphPerfTest::(compare_f, 32FC1, 1920x1080, 0.5, 0.5, { gapi.kernel_package }) 2.246 2.214 1.01
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 128x128, 30x30, { gapi.kernel_package }) 0.005 0.006 0.96
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 128x128, 64x64, { gapi.kernel_package }) 0.011 0.012 0.97
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 640x480, 30x30, { gapi.kernel_package }) 0.007 0.007 0.98
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 640x480, 64x64, { gapi.kernel_package }) 0.014 0.014 0.97
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1280x720, 30x30, { gapi.kernel_package }) 0.008 0.008 0.99
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1280x720, 64x64, { gapi.kernel_package }) 0.018 0.017 1.10
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1920x1080, 30x30, { gapi.kernel_package }) 0.009 0.009 0.99
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1920x1080, 64x64, { gapi.kernel_package }) 0.036 0.021 1.76
The only notable change is in the last case, the 1080p to 64x64 resize (0.036 vs. 0.021, a ratio of 1.76), which may be a rare case in F32. Given this, given that there have been no updates from Anna for the last year, and given that there are now a lot of merge conflicts between this branch and 4.x, I propose to close this PR for now.
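
For anyone who wants to re-check the numbers, here is a minimal sketch that filters rows like the ones above for cases where the ratio moves away from 1.00. It is not part of `summary.py`. It assumes each row ends with three numeric fields (time for the first file, time for the second file, and their ratio), which is inferred from the pasted output rather than taken from the script's documentation. The names `parse_rows`, `notable` and the 0.15 tolerance are hypothetical choices for illustration.

```python
import re

# Assumed row layout, inferred from the pasted output (not from summary.py docs):
#   <test name ending in ')'> <time, first file> <time, second file> <ratio>
ROW = re.compile(
    r"^(?P<name>.+\))\s+(?P<base>\d+\.\d+)\s+(?P<new>\d+\.\d+)\s+(?P<ratio>\d+\.\d+)\s*$"
)


def parse_rows(text):
    """Return (name, base_time, new_time, ratio) tuples for every matching line."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append(
                (m["name"], float(m["base"]), float(m["new"]), float(m["ratio"]))
            )
    return rows


def notable(rows, tolerance=0.15):
    """Keep rows whose ratio differs from 1.0 by more than `tolerance`."""
    return [r for r in rows if abs(r[3] - 1.0) > tolerance]


# Three rows copied from the output above.
SAMPLE = """\
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1280x720, 64x64, { gapi.kernel_package }) 0.018 0.017 1.10
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1920x1080, 30x30, { gapi.kernel_package }) 0.009 0.009 0.99
TestPerformance::ResizePerfTestFluid/ResizePerfTest::(compare_f, 32FC1, 1, 1920x1080, 64x64, { gapi.kernel_package }) 0.036 0.021 1.76
"""

if __name__ == "__main__":
    for name, base, new, ratio in notable(parse_rows(SAMPLE)):
        print(f"{ratio:.2f}  {base:.3f} -> {new:.3f}  {name}")
```

The same two functions can be fed the full `grep 32FC1` output instead of `SAMPLE`. With a 0.15 tolerance, only the 1920x1080 to 64x64 row in the listing above has a ratio outside the band.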