Add AVX2 implementation of deblocking filter / port deblocking asm from HEVC
- Port as much of the HEVC deblock asm to the VVC code as possible
- Add VVC-specific deblocking code
References
Chroma Deblocking Notes
VVC has three types of Chroma deblocking filter operations:
- long-tap filter `FUNC(loop_filter_chroma_strong)`
- short-tap filter `FUNC(loop_filter_chroma_weak)` (equivalent to the HEVC chroma deblocking)
- long-tap filter with reduced support `FUNC(loop_filter_chroma_strong_one_side)`
The filter choice depends on the CU/TU block side length (orthogonal to the boundary being deblocked):
- The upper side of a horizontal boundary can be restricted to $S_p = 1$, otherwise
- if the block side is $\geq 8$ samples, then $S_p = S_q = 3$, otherwise
- $S_p = S_q = 1$
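
A minimal sketch of the length selection above, to make the three cases concrete. The names `ChromaFilterLen`, `chroma_filter_len` and `restrict_p` are hypothetical and not taken from the VVC code. It also assumes that the restriction on the upper side only matters when both sides would otherwise get the long-tap filter, and that the Q side keeps $S_q = 3$ in that case (the reduced-support variant). Any further decisions based on sample values are left out.

```c
#include <stdbool.h>

/* Hypothetical container for the per-side filter lengths. */
typedef struct ChromaFilterLen {
    int sp; /* filter length on the P side */
    int sq; /* filter length on the Q side */
} ChromaFilterLen;

/*
 * side_p, side_q: block side length orthogonal to the edge, in samples.
 * restrict_p:     assumption: true when the upper side of a horizontal
 *                 edge has to be limited to S_p = 1.
 */
static ChromaFilterLen chroma_filter_len(int side_p, int side_q, bool restrict_p)
{
    ChromaFilterLen len = { 1, 1 };      /* short-tap (chroma_weak) by default */

    if (side_p >= 8 && side_q >= 8) {
        len.sp = restrict_p ? 1 : 3;     /* 1 -> chroma_strong_one_side */
        len.sq = 3;                      /* 3/3 -> chroma_strong */
    }
    return len;
}
```

With this mapping, `{3, 3}` would select `chroma_strong`, `{1, 3}` would select `chroma_strong_one_side`, and `{1, 1}` would select `chroma_weak`.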
When $S_p = S_q = 1$, the short-tap filter (`chroma_weak`) is applied. This filter is identical to the (only) HEVC chroma deblocking filter:
$$ \Delta_c = \left(((q_0 - p_0) \ll 2) + p_1 - q_1 + 4\right) \gg 3 $$
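
As a scalar reference for the asm, the formula can be sketched for one line of samples across the edge. This is only an illustration: `clip3` and `chroma_weak_line` are hypothetical helpers, 8-bit samples are assumed, and the clipping of $\Delta_c$ to $[-t_c, t_c]$ follows the HEVC chroma filter rather than anything stated above.

```c
#include <stdint.h>

/* Clamp v to the range [lo, hi]. */
static int clip3(int lo, int hi, int v)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Hypothetical helper: apply the short-tap chroma filter to one line.
 * p1, q1 are only read; p0 and q0 are the two samples next to the edge
 * and are modified in place. Assumes 8-bit samples.
 */
static void chroma_weak_line(int p1, uint8_t *p0, uint8_t *q0, int q1, int tc)
{
    /* "* 4" instead of "<< 2" avoids left-shifting a negative value.
     * The ">> 3" assumes an arithmetic right shift for negative values. */
    int delta = (((*q0 - *p0) * 4) + p1 - q1 + 4) >> 3;

    delta = clip3(-tc, tc, delta);
    *p0 = (uint8_t)clip3(0, 255, *p0 + delta);
    *q0 = (uint8_t)clip3(0, 255, *q0 - delta);
}
```

For example, with `p1 = p0 = 100`, `q0 = q1 = 120` and `tc = 4`, the unclipped delta is 8, so it is clipped to 4 and the edge samples become 104 and 116.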
One general difficulty with an incremental implementation is that `ff_vvc_deblock_horizontal` calls a single function, `vvcdsp.lf.filter_chroma`, which performs both the filter decision calculation and the filtering itself. If I can, at least temporarily, do some of the checks in `ff_vvc_deblock_horizontal`, then I can call HEVC's asm implementation directly.