Stone Chen
Adds AVX2 assembly for SAD used in DMVR (decoder-side motion vector refinement). The main difference is that in VVC, SAD is only calculated on even rows of the PU to...
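For reference, the even-rows-only SAD described above can be sketched in scalar C roughly as follows. This is a minimal illustration, not the code from this PR: the function name `sad_even_rows`, the 16-bit sample type, and the `stride` parameter (counted in samples) are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences between two
 * blocks, visiting only the even rows (0, 2, 4, ...). The sample type and
 * the stride in samples are assumptions for this sketch. */
static int sad_even_rows(const int16_t *a, const int16_t *b,
                         ptrdiff_t stride, int w, int h)
{
    int sad = 0;

    for (int y = 0; y < h; y += 2) {
        for (int x = 0; x < w; x++)
            sad += abs(a[x] - b[x]);
        /* Step two rows at a time so odd rows are never read. */
        a += 2 * stride;
        b += 2 * stride;
    }
    return sad;
}
```

Skipping every other row halves the number of loads per block, which is what a SIMD version can exploit by advancing its row pointers by twice the stride.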
* Port as much of the [HEVC deblock asm](https://github.com/ffvvc/FFmpeg/blob/e81b6d78fc2ddf8edd53a6a052713354ef8d27c2/libavcodec/x86/hevc_deblock.asm) code to VVC as possible
* Add VVC specific deblocking code

References
* [VVC In Loop filters paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9399506)
Based on [Add AVX2 assembly code for inter predict #51](https://github.com/ffvvc/FFmpeg/issues/51)

DMVR (decoder-side motion vector refinement) computes SAD on PUs with the following constraints:
* w >= 8, h >= 8,...
Some parts of the horizontal asm, as I wrote it, aren't compatible with the vertical case. The strong calculation currently stores certain intermediate values to free up registers for later use. This happens in the...
Extremely rough draft of AVX2 support. The chroma and luma loops are separated out purely for experimentation. Unroll loop ` for (int x = x0 ? x0 : grid;...
Previously RANDCLIP(x, diff) was computing `x - diff` and then clipping it between (0, max_pixel_val + rnd() % 2 * diff). This means we're not really generating a random value...