penguinV
Support of AVX-512
More modern CPUs support AVX-512, so why not use it? We don't have CPUs with AVX-512 support yet :(
@0x72D0 would you like a chance to obtain hardware (a CPU) for this? :)
yeah sure, but I don't have the motherboard and all the other hardware for it so...
and also I don't know if I'm gonna have a lot of time with my university...
Ah, I see :( Education is a very important thing in this world ;)
Anyway, if you have the time and willingness, you might consider joining this project as a collaborator :) You can read what that means in GitHub Help.
Hey @0x72D0, would you mind coming back to this issue once you have enough free time? You don't need a processor with AVX-512, only a compiler that supports it. Let's take a blind step and implement AVX-512 without the proper hardware; it can be tested in the future.
Hi @ihhub, I will try this after #431
Hi @ihhub, I was thinking of doing this issue for Hacktoberfest :). Which AVX-512 extensions should we support? I think it's easier to target a specific CPU architecture since there are so many of them. I was thinking about F, CD, VL, DQ, BW since these are the AVX-512 extensions used by the Skylake architecture. Or we could target the Ice Lake architecture with F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, VAES
Hi @0x72D0, the choice is yours. The most important thing is that the compiler supports it. I can help with this issue as much as I can!
Ok, the table found on Wikipedia shows that all AVX-512 processors support F and CD. Since most of the 8-bit SIMD operations we use in this project appear in BW, and BW always comes with VL and DQ, I think we should target F, CD, BW, VL and DQ.
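A minimal sketch of how a build could check that the compiler has exactly this subset enabled. The predefined macros `__AVX512F__`, `__AVX512CD__`, `__AVX512BW__`, `__AVX512VL__` and `__AVX512DQ__` are set by GCC and Clang when the matching `-mavx512...` flags (or a suitable `-march`) are passed; the function names below are hypothetical and not part of penguinV.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: lists the AVX-512 extensions from the proposed subset
// (F, CD, BW, VL, DQ) that the compiler was allowed to use for this build.
std::vector<std::string> enabledAvx512Extensions()
{
    std::vector<std::string> result;
#ifdef __AVX512F__
    result.push_back( "F" );
#endif
#ifdef __AVX512CD__
    result.push_back( "CD" );
#endif
#ifdef __AVX512BW__
    result.push_back( "BW" );
#endif
#ifdef __AVX512VL__
    result.push_back( "VL" );
#endif
#ifdef __AVX512DQ__
    result.push_back( "DQ" );
#endif
    return result;
}

// Hypothetical helper: true only when all five target extensions are enabled.
bool hasTargetAvx512Subset()
{
    return enabledAvx512Extensions().size() == 5;
}
```

This only reports what the compiler may emit. Whether the CPU (or an emulator such as sde) can execute those instructions is a separate, runtime question.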
@ihhub I'm going to implement only three functions per PR to avoid a situation like #434
I tried sde like proposed in #611 with the current master branch (81379fc43e35fd1e509cef7a3b72aea7bbffeb82) and I got the following failed unit_tests:
I've launched sde with the following command: sde -- ./unit_tests on linux 4.19 (manjaro)
Is it correct to say that all tests with AVX-512 support fail?
There might be an error in the way I launch sde: I don't get any error with function_pool::Threshold while running with -march=native
Thread pool uses the code from the penguinv.h file, which internally calls SIMD code first and then normal CPU code. Setting -march=native means we enable all possible instructions available on the CPU.
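To illustrate the "SIMD code first, then normal CPU code" idea, here is a rough sketch of a pixel-wise maximum. It is not penguinV's actual implementation: the function name `maximumSketch` is hypothetical, and the AVX-512 part is only compiled when the compiler defines `__AVX512F__` and `__AVX512BW__` (for example via -march=native on a capable CPU). `_mm512_max_epu8` requires AVX-512 BW.

```cpp
#include <cstddef>
#include <cstdint>
#if defined( __AVX512F__ ) && defined( __AVX512BW__ )
#include <immintrin.h>
#endif

// Hypothetical sketch: out[i] = max( in1[i], in2[i] ) for 8-bit pixels.
void maximumSketch( const uint8_t * in1, const uint8_t * in2, uint8_t * out, size_t size )
{
    size_t i = 0;

#if defined( __AVX512F__ ) && defined( __AVX512BW__ )
    // SIMD part: process 64 bytes (one 512-bit register) per iteration.
    for ( ; i + 64 <= size; i += 64 ) {
        const __m512i a = _mm512_loadu_si512( in1 + i );
        const __m512i b = _mm512_loadu_si512( in2 + i );
        _mm512_storeu_si512( out + i, _mm512_max_epu8( a, b ) );
    }
#endif

    // Normal CPU part: handles the tail, or everything if AVX-512 is not enabled.
    for ( ; i < size; ++i )
        out[i] = in1[i] > in2[i] ? in1[i] : in2[i];
}
```

Because the scalar loop always runs over whatever the SIMD loop left, the result is the same with or without AVX-512 enabled at compile time.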
@ihhub No, the Maximum, Minimum, AbsoluteDifference, Accumulate, etc. work
@ihhub there's no avx512 function for the Threshold function
Let's analyse the functions which are implemented with AVX-512. If some functions pass, it might be that we made a mistake in the other functions.
Let's ignore the Threshold function for now and try to fix the rest of the functions.
What I'm trying to say is that maybe there's an error in the way I launch sde, because functions that are supposed to work don't
@0x72D0 I understand. Let's try this way: comment out all functions in the function_pool namespace in the unit_test_image_function.cpp file, lines 2574 - 2596. Thereafter, please try to run the unit tests again to verify your idea.
Hi @0x72D0 could you please try to run the unit tests with the latest code, as I added some missing code for AVX-512?
Explanation behind the failed functions: in unit tests we disable all instruction sets except a specific one. In this case we enabled AVX-512 only, but we didn't have code in the SIMD common function to run AVX-512 code. As a result the function didn't write anything to the output image, causing the tests to fail.
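The failure mode can be modelled roughly as below. This is a simplified illustration, not penguinV's real dispatch code: `SimdType`, `maximumCommon`, the `hasAvx512Branch` flag and the `bool` return value are all hypothetical, and a scalar loop stands in for the real SIMD kernels.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical instruction set selector used by the unit tests.
enum class SimdType { Avx, Avx512 };

// Scalar stand-in for every SIMD kernel in this sketch.
void maximumScalar( const uint8_t * in1, const uint8_t * in2, uint8_t * out, size_t size )
{
    for ( size_t i = 0; i < size; ++i )
        out[i] = in1[i] > in2[i] ? in1[i] : in2[i];
}

// Hypothetical common function. hasAvx512Branch = false models the code before
// the fix, where no AVX-512 branch existed. Returns true if output was written.
bool maximumCommon( SimdType type, bool hasAvx512Branch, const uint8_t * in1, const uint8_t * in2,
                    uint8_t * out, size_t size )
{
    switch ( type ) {
    case SimdType::Avx:
        maximumScalar( in1, in2, out, size );
        return true;
    case SimdType::Avx512:
        if ( !hasAvx512Branch )
            return false; // nothing is written to the output image
        maximumScalar( in1, in2, out, size );
        return true;
    }
    return false;
}
```

With only AVX-512 enabled and no matching branch, the output image keeps its initial contents, so every comparison in the test fails even though the AVX-512 kernels themselves may be correct.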