SVT-VP9 icon indicating copy to clipboard operation
SVT-VP9 copied to clipboard

Platform detection incorrect for AVX2 support

Open techvintage opened this issue 6 years ago • 7 comments

The current code does not detect platform correctly for AVX2 support and hence uses slower functions on non-Intel platforms.This fix causes AVX2 related functions to be invoked correctly and giving the required 2.5X+ speed-up on AMD and possibly other platforms which support AVX2.

techvintage avatar Jun 06 '19 03:06 techvintage

Hi @techvintage Thanks for this PR. Instead of hard code, could you make the change in can_use_intel_avx512()? You can change the name of this function too.

tianjunwork avatar Jun 06 '19 22:06 tianjunwork

Hi @tianjunwork ,this is not hardcoded but changing asm_type to highest level (=7) to begin with.If AVX2 functionality is not supported,it starts downgrading and uses corresponding slower functions instead.This is opposite approach to setting asm_type=0 (lowest value) and increasing it incrementally and fixes the bug.

techvintage avatar Jun 07 '19 19:06 techvintage

Hi @techvintage, I don't get it, if the logic after uint32_t asm_type = 7; works as expected on your platform, asm_type will be reassigned to a new value. How does your patch work?

tianjunwork avatar Jun 10 '19 16:06 tianjunwork

So can_use_intel_avx512() and can_use_Intel_4thGen_features function do not work correctly on non-Intel Platforms and proceeds with asm_type=7 and picks up the faster functions on AMD platforms correctly.We saw a 2.5X+ gain with this on Matisse with this. Also, I see a FFmpeg patch failing on travis CI,My travis CI is synced to github but looks like cloning ffmpeg git repo timed out.Do we developers have a way of triggering the build again?

techvintage avatar Jun 10 '19 16:06 techvintage

I don't have a non-intel platform so don't know how these two functions behave. You mean can_use_intel_avx512() doesn't work but still return 1? Otherwise it runs to "asm_type = 1;". I re-run the travis task. It is passed now.

tianjunwork avatar Jun 10 '19 18:06 tianjunwork

Yes,that's correct.So if we start with asm_type=7(highest),If can_use_intel_avx512() returns true for Intel platforms,it will be still at 7 else will start lowering asm_type based on support available.

techvintage avatar Jun 11 '19 02:06 techvintage

If can_use_intel_avx512() still returns 1 for non-intel platform, why do you need to set asm_type=7 at the beginning? Intel platform works just fine. Your goal is to handle non-intel platform, right? If so, instead of letting non-working can_use_intel_avx512() and can_use_intel_core4th_gen_features() mess up the logic. Can we make these two function more generic?

tianjunwork avatar Jun 11 '19 05:06 tianjunwork