Improve handling of perceptual encoding metrics

Open lexaknyazev opened this issue 3 years ago • 10 comments

Basis encoder has a "perceptual" flag. When enabled, it assigns uneven weights to red, green, and blue channels for computing the error metric. By the way, only ETC1S is affected by that for now.

Ideally, the settings should be more flexible:

  • compute metrics in linear / non-linear space (i.e. decode 8-bit sRGB to real values), not implemented yet;
  • set per-channel weights (https://github.com/BinomialLLC/basis_universal/issues/202):
    • even;
    • luma, should depend on the color primaries:
      • Rec. 709: [0.2126, 0.7152, 0.0722] (matches the current "perceptual")
      • Rec. 2020: [0.2627, 0.6780, 0.0593]
      • ...
    • custom (e.g. to ignore unused channels).
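As a rough illustration of the per-channel weighting idea in the list above, here is a minimal sketch of a weighted squared-error metric. The function name `weighted_squared_error` and the `LUMA_WEIGHTS` table are made up for this example; this is not the Basis encoder's actual code, only the general shape of such a metric, using the weight values quoted in the list.

```python
# Per-channel weights; the two luma rows use the values quoted in the list above.
# The table and function names are hypothetical, not part of any encoder API.
LUMA_WEIGHTS = {
    "even": (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
    "rec709": (0.2126, 0.7152, 0.0722),
    "rec2020": (0.2627, 0.6780, 0.0593),
}


def weighted_squared_error(original, decoded, weights):
    """Sum of per-channel squared differences, each scaled by its weight."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, original, decoded))


ref = (128, 128, 128)
green_off = (128, 138, 128)  # 10-level error in green only
blue_off = (128, 128, 138)   # 10-level error in blue only

# Under Rec. 709 weights the green error is penalized far more than the blue one.
err_g = weighted_squared_error(ref, green_off, LUMA_WEIGHTS["rec709"])
err_b = weighted_squared_error(ref, blue_off, LUMA_WEIGHTS["rec709"])

# A custom weight vector can ignore an unused channel entirely.
err_ignored = weighted_squared_error(ref, blue_off, (1.0, 1.0, 0.0))
```

With a table like this, choosing weights from the actual color primaries becomes a lookup rather than a single boolean flag.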

Currently, the perceptual flag is tied to the source transfer function which is a bit misleading.

/cc @zeux

lexaknyazev avatar Apr 08 '21 19:04 lexaknyazev

> By the way, only ETC1S is affected by that for now.

This should affect UASTC as well wrt mipmap generation presumably, even if the error metric used by the encoder is flat.

zeux avatar Apr 08 '21 22:04 zeux

> Currently, the perceptual flag is tied to the source transfer function which is a bit misleading.

What do you want to happen? sRGB is perceptual. Linear is not. So how else to select use of the encoder's perceptual flag?

It sounds like you want options to set the per-channel weights. You want these in the ktxBasisParams struct and toktx?

What metrics are you talking about? toktx currently doesn't compute any metrics. Are you saying you want KTX-Software to expose an option to specify in which space the Basis encoder computes its error metrics?

> This should affect UASTC as well wrt mipmap generation

???

Filtering for mipmap generation is done in linear space.

MarkCallow avatar Apr 09 '21 00:04 MarkCallow

> Filtering for mipmap generation is done in linear space.

Yes, which is controlled by --linear, or at least used to be controlled by that flag; presumably it's now controlled by a new flag.

zeux avatar Apr 09 '21 00:04 zeux

--linear, now --assign_oetf linear, overrides whatever information the input image provides about its OETF. It is the information from the input image, possibly overridden, that controls whether sRGB decoding and re-encoding is done during mipmap generation.
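To make the decode, filter, re-encode sequence concrete, here is a minimal sketch using a plain box filter over 8-bit values. It is not toktx's actual resampler; `average_texels` and its `is_srgb` parameter are hypothetical stand-ins for "the OETF information from the input image, possibly overridden".

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear-light value in [0, 1] with the sRGB transfer function."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055


def average_texels(values, is_srgb):
    """Average 8-bit texel values; decode and re-encode around the filter if is_srgb."""
    if not is_srgb:
        # Data is treated as already linear: filter the stored values directly.
        return round(sum(values) / len(values))
    linear = [srgb_to_linear(v / 255.0) for v in values]
    mean = sum(linear) / len(linear)
    return round(linear_to_srgb(mean) * 255.0)


# Box-filter a black/white 2x2 block down to a single texel.
block = [0, 255, 0, 255]
as_srgb = average_texels(block, is_srgb=True)     # filtered in linear light
as_linear = average_texels(block, is_srgb=False)  # filtered on stored values
```

The sRGB path yields a noticeably brighter texel than the direct average, which is why the OETF information matters for mipmap generation even though it is separate from the encoder's error metric.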

As far as I can see, this is independent of @lexaknyazev's request.

MarkCallow avatar Apr 09 '21 01:04 MarkCallow

> sRGB is perceptual. Linear is not. So how else to select use of the encoder's perceptual flag?

Since the currently used perceptual weights are hard-coded to Rec. 709 in the encoder, the perceptual flag makes less sense in a case when different primaries are used.

Once the encoder starts accepting custom channel weights, KTX-Software should just set them directly based on actual primaries instead of using that perceptual flag.

> Are you saying you want KTX-Software to expose an option to specify in which space the Basis encoder computes its error metrics?

Yes, once the encoder supports this option. It's not there yet.

lexaknyazev avatar Apr 09 '21 05:04 lexaknyazev

Looking into this again, I think we do not need a physically linear metric for sRGB-encoded data, because the perceptual luma (Y') is by definition a weighted sum of non-linear values.
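For reference, the distinction can be written out with the Rec. 709 coefficients quoted at the top of the issue. Primes mark non-linear (transfer-function-encoded) values; the unprimed form is the physically linear relative luminance. This only restates the standard definitions and says nothing about what the encoder computes internally:

```latex
\begin{align*}
Y' &= 0.2126\,R' + 0.7152\,G' + 0.0722\,B' && \text{(luma, from non-linear values)} \\
Y  &= 0.2126\,R  + 0.7152\,G  + 0.0722\,B  && \text{(relative luminance, from linear values)}
\end{align*}
```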

lexaknyazev avatar Apr 10 '21 13:04 lexaknyazev

PR #534 adds a --astc_perceptual flag for the ASTC encoder. Here's the doc:

> The codec should optimize for perceptual error, instead of direct RMS error. This aims to improve perceived image quality, but typically lowers the measured PSNR score. Perceptual methods are currently only available for normal maps and RGB color data.

Since this is not tied to sRGB inputs, it seems to me to be different from the perceptual flag in the BasisU encoder, which is tied to sRGB, so we made it ASTC-specific. @lexaknyazev please take a look and answer the following: is this actually similar to what you are requesting here for the BasisU encoder, minus being able to specify the weights?

> Looking into this again, I think we do not need a physically linear metric for sRGB-encoded data, because the perceptual luma (Y') is by definition a weighted sum of non-linear values.

Please explain. Are you saying perceptual mode should only be applied to non-sRGB inputs?

MarkCallow avatar Feb 20 '22 07:02 MarkCallow

There are two different aspects to perceptual metrics - chroma bias, and gamma bias.

The current --astc_perceptual mode only corrects for chroma bias, and should be valid for both sRGB and linear encodes.

The gamma bias is something I'm not yet sure how to correct for (or even if it's desirable to do so).

For sRGB textures the gamma-corrected luminance curve means that an error of X should have the same weight no matter the absolute luminance involved, so I think it's safe to ignore gamma bias.

For linear textures that store color, the significance of an error X depends on the absolute luminance you are starting with, as it is obviously not following the perceptual curve. I think the standard fix for this is using Y'CbCr for error calculations, as this can correct for both the gamma curve and the luma-chroma perceptual significance differences. However, if you care enough about gamma correcting your errors, you should probably be using sRGB textures for color data anyway, so I'm in two minds about whether this really has much value. There doesn't seem to be much point in using a gamma-corrected error metric in the compressor only to stuff the compressed data into a non-gamma-corrected encoding.
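A minimal sketch of what a Y'CbCr-based error for linear data could look like, assuming the sRGB transfer function for the gamma step and Rec. 709 coefficients for the color transform. The helper names and the `chroma_weight` knob are hypothetical and do not come from astcenc or any other encoder.

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722  # Rec. 709 luma coefficients


def linear_to_srgb(c):
    """Encode one linear-light value in [0, 1] with the sRGB transfer function."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055


def to_ycbcr(rgb_encoded):
    """Convert non-linear R'G'B' in [0, 1] to Y'CbCr (Cb and Cr centered on 0)."""
    r, g, b = rgb_encoded
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr


def ycbcr_error(linear_a, linear_b, chroma_weight=0.25):
    """Squared error between two linear RGB texels, measured in Y'CbCr.

    chroma_weight is a hypothetical knob: values below 1 make chroma
    errors count less than luma errors.
    """
    a = to_ycbcr([linear_to_srgb(c) for c in linear_a])
    b = to_ycbcr([linear_to_srgb(c) for c in linear_b])
    luma = (a[0] - b[0]) ** 2
    chroma = (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2
    return luma + chroma_weight * chroma


# The same linear error of 0.01 on a grey texel, dark versus bright.
err_dark = ycbcr_error((0.01, 0.01, 0.01), (0.02, 0.02, 0.02))
err_bright = ycbcr_error((0.80, 0.80, 0.80), (0.81, 0.81, 0.81))
```

Under this metric the dark error scores much higher than the bright one even though both are 0.01 in linear terms, which is the luminance dependence described above.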

solidpixel avatar Aug 17 '22 21:08 solidpixel

Ping @lexaknyazev. Please answer the questions in my comment and let us know if you have any thoughts about what @solidpixel wrote.

MarkCallow avatar Aug 19 '22 03:08 MarkCallow

> is this actually similar to what you are requesting here for the BasisU encoder, minus being able to specify the weights?

Yes.

> Are you saying perceptual mode should only be applied to non-sRGB inputs?

Technically, this is covered by @solidpixel's comment above. We do need configurable chroma bias everywhere, regardless of transfer functions (think raw float16 to ASTC HDR path).

Correcting the gamma bias (i.e., decoding 8-bit sRGB to linear floats before computing the error) may even produce worse final results because errors in lower values are usually more important than errors in upper values for color data.
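A small numeric sketch of that concern, assuming the standard sRGB decoding function: the same 8-level error is measured once on the stored values and once after decoding to linear light. The variable names are illustrative only.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def squared_error(a, b):
    return (a - b) ** 2


dark = (16, 24)       # 8-level error near black
bright = (232, 240)   # 8-level error near white

# Measured on the stored (non-linear) values, both errors weigh the same.
encoded_dark = squared_error(dark[0] / 255.0, dark[1] / 255.0)
encoded_bright = squared_error(bright[0] / 255.0, bright[1] / 255.0)

# Measured after decoding to linear light, the dark error nearly vanishes.
linear_dark = squared_error(srgb_to_linear(dark[0] / 255.0),
                            srgb_to_linear(dark[1] / 255.0))
linear_bright = squared_error(srgb_to_linear(bright[0] / 255.0),
                              srgb_to_linear(bright[1] / 255.0))
```

An encoder minimizing the linear-space error would therefore tolerate larger deviations in dark regions, which is the opposite of what is usually wanted for color data.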

lexaknyazev avatar Aug 19 '22 12:08 lexaknyazev