FFmpeg
vvc_deblock: add AVX2-accelerated chroma deblock filter
[refact] tests/checkasm/checkasm --test=vvc_deblock --benchmark deblock_chroma
benchmarking with native FFmpeg timers
nop: 48.9
checkasm: using random seed 189563177
AVX2:
- vvc_deblock.chroma [OK]
checkasm: all 4 tests passed
vvc_h_loop_filter_chroma8 weak_c: 50.7
vvc_h_loop_filter_chroma8 weak_avx2: 14.7
vvc_h_loop_filter_chroma10 weak_c: 50.7
vvc_h_loop_filter_chroma10 weak_avx2: 5.7
vvc_v_loop_filter_chroma8_c: 41.7
vvc_v_loop_filter_chroma8_avx2: 32.7
vvc_v_loop_filter_chroma10_c: 41.7
vvc_v_loop_filter_chroma10_avx2: 32.7
This currently fails most conformance tests even though checkasm passes, which suggests a problem in both my code and my checkasm implementation.
Benchmark results
| File | C | AVX2 |
|---|---|---|
| BQTerrace_1920x1080_60_10_420_22_RA.vvc | 74.0 | 76.3 |
| RitualDance_1920x1080_60_10_420_37_RA.266 | 119.7 | 131.3 |
| RitualDance_1920x1080_60_10_420_32_LD.266 | 120.0 | 123.7 |
| Chimera_8bit_1080P_1000_frames.vvc | 155.0 | 156.3 |
| Tango2_3840x2160_60_10_420_27_LD.266 | 31.3 | 32.7 |
| NovosobornayaSquare_1920x1080.bin | 151.7 | 151.7 |
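To make the table easier to read at a glance, here is a minimal sketch that computes the relative difference between the C and AVX2 columns. The numbers are copied from the table above. The table does not state a unit, so the sketch reports only the percentage difference and makes no claim about which direction is better. The names `ROWS` and `relative_change` are hypothetical helpers, not part of FFmpeg.

```python
# Values copied from the benchmark table above: (file, C, AVX2).
# The unit is not stated in the table, so only relative differences are computed.
ROWS = [
    ("BQTerrace_1920x1080_60_10_420_22_RA.vvc", 74.0, 76.3),
    ("RitualDance_1920x1080_60_10_420_37_RA.266", 119.7, 131.3),
    ("RitualDance_1920x1080_60_10_420_32_LD.266", 120.0, 123.7),
    ("Chimera_8bit_1080P_1000_frames.vvc", 155.0, 156.3),
    ("Tango2_3840x2160_60_10_420_27_LD.266", 31.3, 32.7),
    ("NovosobornayaSquare_1920x1080.bin", 151.7, 151.7),
]


def relative_change(c: float, avx2: float) -> float:
    """Return the AVX2 value's difference from the C value, in percent."""
    return (avx2 - c) / c * 100.0


for name, c, avx2 in ROWS:
    print(f"{name}: {relative_change(c, avx2):+.1f}%")
```

The differences are small (a few percent at most for all but one file), which is plausible if chroma deblocking is only a small part of total decode time, but that explanation is an assumption and not something measured here.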
[FFmpeg] tests/checkasm/checkasm --test=vvc_deblock --benchmark
benchmarking with native FFmpeg timers
nop: 49.2
checkasm: using random seed 1714318218
AVX2:
- vvc_deblock.chroma [OK]
checkasm: all 6 tests passed
vvc_h_loop_filter_chroma_8_c: 50.7
vvc_h_loop_filter_chroma_8_avx2: 14.7
vvc_h_loop_filter_chroma_10_c: 41.7
vvc_h_loop_filter_chroma_10_avx2: 5.7
vvc_h_loop_filter_chroma_12_c: 41.7
vvc_h_loop_filter_chroma_12_avx2: 14.7
vvc_v_loop_filter_chroma_8_c: 50.7
vvc_v_loop_filter_chroma_8_avx2: 23.7
vvc_v_loop_filter_chroma_10_c: 41.7
vvc_v_loop_filter_chroma_10_avx2: 23.7
vvc_v_loop_filter_chroma_12_c: 41.7
vvc_v_loop_filter_chroma_12_avx2: 32.7
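The per-function timings are easier to compare as ratios. The following is a rough sketch that parses the `name_c` / `name_avx2` lines of the second checkasm run above and divides the C timing by the AVX2 timing. The `LOG` string is copied from that run; `speedups` is a hypothetical helper and not part of checkasm.

```python
import re

# Timing lines copied from the second checkasm run above.
LOG = """\
vvc_h_loop_filter_chroma_8_c: 50.7
vvc_h_loop_filter_chroma_8_avx2: 14.7
vvc_h_loop_filter_chroma_10_c: 41.7
vvc_h_loop_filter_chroma_10_avx2: 5.7
vvc_h_loop_filter_chroma_12_c: 41.7
vvc_h_loop_filter_chroma_12_avx2: 14.7
vvc_v_loop_filter_chroma_8_c: 50.7
vvc_v_loop_filter_chroma_8_avx2: 23.7
vvc_v_loop_filter_chroma_10_c: 41.7
vvc_v_loop_filter_chroma_10_avx2: 23.7
vvc_v_loop_filter_chroma_12_c: 41.7
vvc_v_loop_filter_chroma_12_avx2: 32.7
"""

# Matches "<function>_<impl>: <timing>" where impl is "c" or "avx2".
LINE_RE = re.compile(r"(\w+)_(c|avx2):\s*([\d.]+)$")


def speedups(log: str) -> dict[str, float]:
    """Map each function name to its C timing divided by its AVX2 timing."""
    timings: dict[str, dict[str, float]] = {}
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            name, impl, value = match.groups()
            timings.setdefault(name, {})[impl] = float(value)
    # Only report functions that have both a C and an AVX2 measurement.
    return {
        name: t["c"] / t["avx2"]
        for name, t in timings.items()
        if "c" in t and "avx2" in t
    }


for name, ratio in speedups(LOG).items():
    print(f"{name}: {ratio:.2f}x")
```

On these numbers the horizontal filters come out noticeably faster relative to C than the vertical ones. Several timings repeat exactly (41.7, 14.7, 32.7), so the timer resolution may be coarse and the ratios should be read as rough indications only.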
I have no idea why the test is failing here. For reference, the checkasm test passes fully on my Arch Linux system. (checkasm does not run benchmarks if a test fails.)