
chroma_offset for 4:4:4 looks suspicious

kornelski opened this issue 8 months ago

`Q_MODEL_MUL` and `Q_MODEL_ADD` are used only for 8-bit depth. This seems odd, because the perceptual effects of chroma's spatial resolution do not depend on the bit depth.

The adjustment is not applied to intra frames either, and I'm not sure why. It would be nice to have appropriate chroma quality for still AVIF images.
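To make the gating concrete, here is a minimal sketch of a linear chroma-quantizer model applied only at 8-bit depth and only to inter frames, mirroring the behavior described above. The constants and the function are placeholders for illustration, not rav1e's actual values or code:

```rust
// Placeholder constants for illustration only -- NOT rav1e's actual values.
const Q_MODEL_MUL: f64 = 0.85;
const Q_MODEL_ADD: f64 = 12.0;

// Hypothetical sketch of the gating: the linear chroma adjustment is applied
// only when bit_depth == 8 and the frame is not intra; all other
// configurations pass the luma quantizer through unchanged.
fn chroma_q(luma_q: f64, bit_depth: u8, is_intra: bool) -> f64 {
    if bit_depth == 8 && !is_intra {
        luma_q * Q_MODEL_MUL + Q_MODEL_ADD
    } else {
        // 10/12-bit depths and intra frames skip the adjustment entirely,
        // even though chroma's spatial resolution is identical.
        luma_q
    }
}

fn main() {
    println!("{}", chroma_q(100.0, 8, false));  // adjusted
    println!("{}", chroma_q(100.0, 10, false)); // unadjusted
    println!("{}", chroma_q(100.0, 8, true));   // unadjusted (intra)
}
```

The same content encoded at 8-bit versus 10-bit thus goes through a different chroma model, which is the inconsistency being questioned here.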

https://github.com/xiph/rav1e/blob/c247d53ae43dd1312dbd90117c45e4c0ee6b06ce/src/rate.rs#L534-L538

I suspect that the slopes and bases for 4:4:4 may be overfitting something, or lack clamping before the calculation:

https://github.com/xiph/rav1e/blob/c247d53ae43dd1312dbd90117c45e4c0ee6b06ce/src/rate.rs#L510-L523

That's because in `dc_qi` and `ac_qi`, the `u` plane can end up with a higher qi than `y`:

```
QuantizerParameters {
    log_base_q: 596939234966375734,
    log_target_q: 540310075640714472,
```

|  | 10-bit 4:4:4 | 8-bit 4:4:4 | 10-bit 4:2:0 | 8-bit 4:2:0 |
|---|---|---|---|---|
| `dc_qi` | [110, 131, 108] | [110, 132, 108] | [110, 101, 75] | [110, 101, 73] |
| `ac_qi` | [99, 118, 97] | [98, 118, 97] | [99, 90, 65] | [98, 88, 61] |
| `lambda` | 20.83157578250051 | 21.039473974368523 | 20.83157578250051 | 21.039473974368523 |
| `dist_scale` | [1.0, 0.54229736328125, 1.0629119873046875] | [1.0, 0.5428314208984375, 1.0639495849609375] | [1.0, 1.2977294921875, 2.5435638427734375] | [1.0, 1.3011627197265625, 2.5502777099609375] |
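A missing clamp would explain this pattern. With a per-plane linear model of the form `qi_chroma = base + slope * qi_luma` (the numbers below are placeholders, not rav1e's fitted 4:4:4 parameters), any fit with `slope < 1` and a positive `base` crosses the identity line at `qi = base / (1 - slope)`; below that crossover, chroma is quantized more coarsely than luma unless the result is clamped:

```rust
// Hypothetical illustration of the clamping concern, with placeholder
// parameters -- NOT rav1e's fitted 4:4:4 values.
fn chroma_qi(base: f64, slope: f64, qi_luma: f64) -> f64 {
    base + slope * qi_luma
}

fn main() {
    // Crossover with the identity line at 40 / (1 - 0.8) = 200.
    let (base, slope) = (40.0, 0.8);
    for qi_luma in [110.0, 200.0, 250.0] {
        let qi_u = chroma_qi(base, slope, qi_luma);
        let note = if qi_u > qi_luma {
            "chroma coarser than luma"
        } else {
            "ok"
        };
        println!("luma qi {qi_luma:>5} -> u qi {qi_u:>5} ({note})");
    }
}
```

At `qi_luma = 110` this toy model yields a chroma qi of 128, above luma, which matches the shape of the 4:4:4 `dc_qi`/`ac_qi` values observed above (e.g. luma 110, `u` 131).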

— kornelski, Jun 24 '25

@barrbrain Is this behavior correct? Is there a reason for ac_qi for luma to be so low compared to chroma for 444?

— kornelski, Jul 08 '25

I am seeing a significantly worse compression ratio than libaom for 4:4:4, likely due to this issue. For still AVIF images I get consistently worse SSIM at up to 20% larger file sizes, even at very high effort settings. The bit allocation in rav1e differs markedly from libaom's on the affected images, which is why I suspect this issue is the cause.

I've done a bit of digging to understand where these particular parameters and heuristics came from.

`Q_MODEL_MUL` and `Q_MODEL_ADD` were introduced in https://github.com/xiph/rav1e/pull/2497 and were backed by research in a Jupyter notebook that the PR also added. The parameters were chosen based on an experiment on just 4 video files.

The notebook does explain why the same logic is not applied to intra frames:

> We can repeat the same steps for intra frames. However, testing showed that using the derived linear models lead to worse encoding results.

However, no rationale for restricting these parameters to 8-bit depth is given in either the PR or the notebook.

The current heuristics in `chroma_offset()` were introduced in https://github.com/xiph/rav1e/pull/2272, along with the comment `// TODO: Optimal offsets for more configurations than just BT.709`, and were optimized for PSNR rather than a perceptual metric such as SSIM. The optimization process was not disclosed in detail and was not accompanied by a Jupyter notebook; based on the comments on the PR, the choice of these values does not seem very rigorous. It also predates the introduction of `Q_MODEL_MUL` and `Q_MODEL_ADD`.

— Shnatsel, Sep 27 '25