HDR gain maps

Open neckaros opened this issue 10 months ago • 9 comments

Now that this library can manage additional metadata like ICC profiles and orientation, what is the strategy on gain maps / transfer functions? They are used more and more with the advance of HDR-capable displays, and most phones now take HDR photos and save them using gain maps for backward compatibility with SDR screens.

For some context here is the discussion for the implementation that has been done in ImageMagick: https://github.com/ImageMagick/ImageMagick/issues/6377

Features that could be implemented:

  • As with ICC profiles: get and set HDR gain maps (each encoder / decoder will need to implement it)
  • Allow conversion of an SDR image to HDR using a gain map as input

neckaros avatar Feb 01 '25 15:02 neckaros

Here are a few issues with gain maps, at least with the math:

  • libultrahdr has its own gain map
  • AVIF has its own gain map
  • Apple and Adobe are developing an ISO standard for their own gain map
  • Upcoming Apple and Adobe ISO standard is behind a paywall: https://www.iso.org/standard/86775.html. Since the official specification is not freely available, reverse engineering or referencing Apple's publicly documented approach (Apple HDR) may be necessary.
  • AVIF gain maps are highly complex and require full YUV and scaling support.
  • Fast scaling is necessary for all gain map implementations (Apple and Adobe store gain maps as a Y plane at 1/4 of the image size); at least a hardcoded bilinear upscaler from 1/4 to full size is required.
  • A CMS (at least a basic one) is required, because the gain map and the associated images are usually stored with a different transfer function and color primaries than most displays require. Built-in color transformation is necessary.
  • In AVIF, gain maps can have different color primaries, matrices, and transfer functions compared to the main image
  • Some implementations also allow a separate ICC profile on the gain map
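
As a rough illustration of the "hardcoded bilinear upscaler" point above, here is a minimal sketch for a single-channel plane. The function name and the pixel-center alignment are assumptions made for this example; it is not taken from any of the libraries mentioned, and a real decoder may align samples differently.

```rust
/// Bilinearly upscale a single-channel plane from `sw` x `sh` to `dw` x `dh`.
/// Hypothetical helper for illustration; sample positions are aligned on pixel centers.
pub fn upscale_bilinear(src: &[f32], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<f32> {
    assert!(sw > 0 && sh > 0 && src.len() == sw * sh);
    let mut dst = vec![0.0f32; dw * dh];
    // Ratio of source size to destination size along each axis.
    let sx = sw as f32 / dw as f32;
    let sy = sh as f32 / dh as f32;
    for y in 0..dh {
        // Map the destination pixel center back into source coordinates, clamped to the plane.
        let fy = ((y as f32 + 0.5) * sy - 0.5).clamp(0.0, (sh - 1) as f32);
        let y0 = fy.floor() as usize;
        let y1 = (y0 + 1).min(sh - 1);
        let wy = fy - y0 as f32;
        for x in 0..dw {
            let fx = ((x as f32 + 0.5) * sx - 0.5).clamp(0.0, (sw - 1) as f32);
            let x0 = fx.floor() as usize;
            let x1 = (x0 + 1).min(sw - 1);
            let wx = fx - x0 as f32;
            // Interpolate horizontally on the two neighbouring rows, then vertically.
            let top = src[y0 * sw + x0] * (1.0 - wx) + src[y0 * sw + x1] * wx;
            let bot = src[y1 * sw + x0] * (1.0 - wx) + src[y1 * sw + x1] * wx;
            dst[y * dw + x] = top * (1.0 - wy) + bot * wy;
        }
    }
    dst
}
```

For a gain map stored at reduced resolution, `dw` and `dh` would be the dimensions of the primary image.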

awxkee avatar Feb 20 '25 17:02 awxkee

Recently I've read about difficulties handling HDR images in PQ and HLG. I hoped that gain maps would be a simpler alternative, but I see here that it's more of the same :)

kornelski avatar Feb 21 '25 23:02 kornelski

I recently ported my HDR tonemapping to Rust; you could copy/use it: https://github.com/awxkee/gainforge. It is actually not that complex, at least until you want to use ITU-R BT.2408. When you don't need tonemapping, you will probably use a HAL surface from Vulkan or Metal, or something similar, where everything is already handled for you.

At the moment, I think AVIF gain maps are more overcomplicated than the others. They can have their own transfer function, matrix coefficients, ICC profile, and primaries, can be any size, may contain either 1 or 3 channels, and, finally, are stored in YUV.

I believe the Apple gain map is an easy one (at least only scaling is required and the math is simple), but the specification is behind a paywall.

I looked at UHDR, and it is closer to what Apple uses. The gain map is stored in some ISO metadata. If any of you can take a look at this and explain how to actually extract it without building libjpeg, I may check the implementation further.

awxkee avatar Feb 21 '25 23:02 awxkee

This is in JPEG app marker 2, if the marker data starts with the bytes urn:iso:std:iso:ts:21496:-1\0. If the data is longer than 64KB, it is split across several markers; concatenate these together. They may have a prefix for sorting them (see how ICC_PROFILE does it).

Any JPEG decoder should give you the markers. If you must, you can parse the JPEG structure yourself, since it's very basic: 0xFF, a byte for the marker type (0xE2 is app marker 2), 2 big-endian bytes for marker length, and repeat. Stop at FF D8, because that's the image data without defined length.
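
A minimal sketch of that marker walk, assuming the whole file is already in memory. The function name is made up for this example. The sketch stops at the start-of-scan marker (0xFF 0xDA), since entropy-coded data follows it with no length field, and it also stops at end-of-image. It only inspects the first JPEG in the buffer and does not handle any sorting prefix inside the payload.

```rust
/// Prefix that identifies ISO 21496-1 gain map metadata inside an APP2 segment.
const ISO_PREFIX: &[u8] = b"urn:iso:std:iso:ts:21496:-1\0";

/// Collect the payloads (with the prefix stripped) of all APP2 segments that start
/// with `ISO_PREFIX`, in file order. Hypothetical helper, not a zune-jpeg API.
pub fn find_iso_app2(data: &[u8]) -> Vec<&[u8]> {
    let mut out = Vec::new();
    // A JPEG stream starts with SOI (FF D8), which has no length field.
    if data.len() < 2 || data[0] != 0xFF || data[1] != 0xD8 {
        return out;
    }
    let mut pos = 2;
    while pos + 4 <= data.len() {
        if data[pos] != 0xFF {
            break; // not positioned on a marker: malformed or unsupported layout
        }
        let marker = data[pos + 1];
        if marker == 0xFF {
            pos += 1; // fill byte before a marker
            continue;
        }
        if marker == 0xDA || marker == 0xD9 {
            break; // start of scan or end of image: no more header segments
        }
        // The 2-byte big-endian length counts itself but not the marker bytes.
        let len = u16::from_be_bytes([data[pos + 2], data[pos + 3]]) as usize;
        if len < 2 || pos + 2 + len > data.len() {
            break;
        }
        let payload = &data[pos + 4..pos + 2 + len];
        if marker == 0xE2 && payload.starts_with(ISO_PREFIX) {
            out.push(&payload[ISO_PREFIX.len()..]);
        }
        pos += 2 + len;
    }
    out
}
```

If several chunks come back, `find_iso_app2(&bytes).concat()` joins them in file order.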

kornelski avatar Feb 22 '25 00:02 kornelski

Thanks @kornelski for pointing out how to handle that!

Here is the assumed file structure of UHDR. A UHDR image is literally 2 JPEGs concatenated together in one file. There are 2 possible metadata containers: an XMP profile and APP2 urn:iso:std:iso:ts:21496:-1\0. However, when I tried to stop at FF D8, it always signaled that the APP2 ISO marker was not found. I tried to patch jpeg-decoder, but it also never finds that chunk. So I had to write this to extract the gain map metadata. There are also two APP2 urn:iso:std:iso:ts:21496:-1\0 markers, one in the primary image and one in the secondary image, and only the second one contains the metadata itself.

There is also something incorrect in jpeg-decoder because it is unable to decode the secondary image, whereas libjpeg-turbo and zune-jpeg can handle it.

I also had to patch jpeg-decoder to expose the reader stream as public in order to find the end of the primary image. At the moment, there are no Rust decoders that allow handling multiple images in a single file in a straightforward way.

The gain map math itself is simple. However, a CMS is required. UHDR assumes that the image has one of the following color spaces: Display P3, BT.2020, or sRGB, whereas most renderings will use sRGB. Therefore, having at least a basic CMS is necessary, because the math should be applied in a linear color space, and the gain map may have a different color space than the primary image. As a result, a conversion from the gain map's ICC profile to the primary image's ICC profile might be needed. So I implemented a basic CMS parser that can perform transformations in the XYZ profile connection space for RGB. It might not be so basic after all; if there is interest, we could move it into another repo, fuzz it on RGB profiles, add grayscale conversions, and make it capable of handling a vast number of existing ICC profiles.
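
To make "connecting RGB color spaces through XYZ" concrete, here is a small sketch that derives a linear RGB to RGB matrix from chromaticity coordinates. It is not the gainforge CMS: the names are invented for this example, it assumes both spaces share the D65 white point (so no chromatic adaptation is needed), and it operates on linear values, so the transfer function has to be removed first.

```rust
type Mat3 = [[f64; 3]; 3];

/// xy chromaticities of the red, green and blue primaries.
pub const SRGB: [[f64; 2]; 3] = [[0.64, 0.33], [0.30, 0.60], [0.15, 0.06]];
pub const DISPLAY_P3: [[f64; 2]; 3] = [[0.680, 0.320], [0.265, 0.690], [0.150, 0.060]];
pub const D65: [f64; 2] = [0.3127, 0.3290];

fn mul_vec(m: &Mat3, v: [f64; 3]) -> [f64; 3] {
    let mut out = [0.0; 3];
    for r in 0..3 {
        out[r] = m[r][0] * v[0] + m[r][1] * v[1] + m[r][2] * v[2];
    }
    out
}

fn mul(a: &Mat3, b: &Mat3) -> Mat3 {
    let mut out = [[0.0; 3]; 3];
    for r in 0..3 {
        for c in 0..3 {
            out[r][c] = a[r][0] * b[0][c] + a[r][1] * b[1][c] + a[r][2] * b[2][c];
        }
    }
    out
}

/// Inverse of a 3x3 matrix via the adjugate; assumes the matrix is non-singular.
fn invert(m: &Mat3) -> Mat3 {
    let [[a, b, c], [d, e, f], [g, h, i]] = *m;
    let det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
    [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]
}

/// XYZ of a chromaticity with luminance Y = 1.
fn xy_to_xyz(xy: [f64; 2]) -> [f64; 3] {
    [xy[0] / xy[1], 1.0, (1.0 - xy[0] - xy[1]) / xy[1]]
}

/// Linear RGB -> XYZ matrix for the given primaries and white point.
pub fn rgb_to_xyz(primaries: [[f64; 2]; 3], white: [f64; 2]) -> Mat3 {
    let (r, g, b) = (xy_to_xyz(primaries[0]), xy_to_xyz(primaries[1]), xy_to_xyz(primaries[2]));
    let p = [[r[0], g[0], b[0]], [r[1], g[1], b[1]], [r[2], g[2], b[2]]];
    // Scale each primary so that RGB = (1, 1, 1) lands exactly on the white point.
    let s = mul_vec(&invert(&p), xy_to_xyz(white));
    let mut out = [[0.0; 3]; 3];
    for row in 0..3 {
        for col in 0..3 {
            out[row][col] = p[row][col] * s[col];
        }
    }
    out
}

/// Matrix converting linear RGB in `src` primaries to linear RGB in `dst` primaries.
pub fn conversion_matrix(src: [[f64; 2]; 3], dst: [[f64; 2]; 3], white: [f64; 2]) -> Mat3 {
    mul(&invert(&rgb_to_xyz(dst, white)), &rgb_to_xyz(src, white))
}

/// Apply a conversion matrix to one linear RGB pixel.
pub fn convert(m: &Mat3, rgb: [f64; 3]) -> [f64; 3] {
    mul_vec(m, rgb)
}
```

With `conversion_matrix(DISPLAY_P3, SRGB, D65)`, white stays white, while saturated P3 colors come out with components outside [0, 1] in sRGB, which is why gamut handling or tone mapping usually follows.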

Here is the method of applying the gain map. It is mostly the same as tone mapping.
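
For readers without the link, here is a hedged sketch of the per-pixel math as I understand it from the publicly documented Ultra HDR formula. The struct and field names are illustrative and are not gainforge's API. The input is assumed to be linear SDR, with the gain map sample already decoded to [0, 1] and scaled to the pixel's position.

```rust
/// Gain map parameters; names are illustrative, values normally come from the metadata.
pub struct GainMapParams {
    pub gain_min_log2: f32,         // log2 of the minimum content boost
    pub gain_max_log2: f32,         // log2 of the maximum content boost
    pub gamma: f32,                 // gamma applied to the stored gain map
    pub offset_sdr: f32,
    pub offset_hdr: f32,
    pub hdr_capacity_min_log2: f32, // log2 display boost at which the map starts to apply
    pub hdr_capacity_max_log2: f32, // log2 display boost at which the map applies fully
}

/// How much of the gain map to apply for a display with the given linear boost
/// (ratio of HDR peak to SDR white).
pub fn display_weight(p: &GainMapParams, display_boost: f32) -> f32 {
    let span = p.hdr_capacity_max_log2 - p.hdr_capacity_min_log2;
    if span <= 0.0 {
        return 1.0; // degenerate metadata: this sketch chooses to apply the map fully
    }
    ((display_boost.log2() - p.hdr_capacity_min_log2) / span).clamp(0.0, 1.0)
}

/// Apply the gain to one linear SDR channel value.
pub fn apply_gain(p: &GainMapParams, sdr_linear: f32, recovery: f32, weight: f32) -> f32 {
    // Undo the gamma used when the gain map was encoded.
    let log_recovery = recovery.clamp(0.0, 1.0).powf(1.0 / p.gamma);
    // Interpolate between minimum and maximum boost in log2 space.
    let log_boost = p.gain_min_log2 * (1.0 - log_recovery) + p.gain_max_log2 * log_recovery;
    (sdr_linear + p.offset_sdr) * (log_boost * weight).exp2() - p.offset_hdr
}
```

A weight of 0 returns the SDR value (up to the offsets), and a weight of 1 applies the full boost encoded in the map.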

If someone wants to try it out here is an app. It may not work :)

In summary: a patch in zune-image is required to extract the necessary markers and XMP. Additionally, a patch is needed at some layer to determine the number of images in the stream and check whether a gain map is present. A CMS is required. Scaling is probably not required specifically for UHDR; it seems to assume that the gain map is the same size as the primary image. For reading/writing gain map metadata in XMP, an XML parser will also be required.

Handling all of the JPEG-related part correctly is beyond my knowledge. If there is interest and someone handles the JPEG part, I can handle the CMS and the math. Until then, if anyone needs it, they can use what I have developed so far.

awxkee avatar Feb 24 '25 23:02 awxkee

Your links to jpeg-decoder seem broken, so I don't see what changed. I guess if it stores the whole image, it will store it similarly to JPEG thumbnails, and that means you get a Start-Of-Image marker without a length, and need to scan the data for 0xFFs to find the end, but you've probably figured that out.
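
The scan described above can be sketched as follows. This is an assumption-laden illustration rather than code from jpeg-decoder or zune-jpeg: the function name is invented, the file is assumed to be fully in memory, and the second image is assumed to start right after the first end-of-image marker, with no padding in between.

```rust
/// Return the offset just past the EOI marker (FF D9) of the first JPEG in `data`,
/// or `None` if the stream is truncated or malformed. Hypothetical helper.
pub fn first_jpeg_end(data: &[u8]) -> Option<usize> {
    if data.len() < 2 || data[0] != 0xFF || data[1] != 0xD8 {
        return None; // must start with SOI
    }
    let mut pos = 2;
    while pos + 1 < data.len() {
        if data[pos] != 0xFF {
            return None; // expected a marker here
        }
        let marker = data[pos + 1];
        match marker {
            0xFF => pos += 1,                // fill byte
            0xD9 => return Some(pos + 2),    // EOI: end of this image
            0x01 | 0xD0..=0xD7 => pos += 2,  // standalone markers without a length
            _ => {
                // Segment with a 2-byte big-endian length that counts itself.
                if pos + 4 > data.len() {
                    return None;
                }
                let len = u16::from_be_bytes([data[pos + 2], data[pos + 3]]) as usize;
                if len < 2 {
                    return None;
                }
                pos += 2 + len;
                if marker == 0xDA {
                    // Entropy-coded data has no length: scan for the next real marker.
                    // FF 00 is a stuffed byte, FF D0..D7 are restart markers.
                    while pos + 1 < data.len() {
                        if data[pos] == 0xFF {
                            let next = data[pos + 1];
                            let in_scan = next == 0x00
                                || next == 0xFF
                                || (0xD0..=0xD7).contains(&next);
                            if !in_scan {
                                break;
                            }
                        }
                        pos += 1;
                    }
                }
            }
        }
    }
    None
}
```

Because APP segments are skipped by their length, an embedded thumbnail with its own SOI/EOI inside APP1 does not confuse the scan. Calling the function again on `&data[end..]` would then locate the secondary image under the assumption stated above.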

Have you really written your own CMS for this!?

There's https://lib.rs/qcms from Firefox. I maintain lcms2 bindings https://lib.rs/lcms2, and although that is C, I found it pretty easy to build and stable.

kornelski avatar Feb 25 '25 15:02 kornelski

> Have you really written your own CMS for this!?

lcms2 might be adapted for this case. I have never done exactly this through lcms2, but it is likely possible. However, this requires adding a dependency that is not really safe.

qcms is limited to 8-bit processing and doesn't expose its math externally. As a result, it is not useful for HDR, which is usually at least 10-bit and requires doing the transform externally.

Technically, writing a CMS for HDR only is easy, because only a parser and a connection of RGB color spaces through XYZ are required. For now, HDR comes only in Display P3, BT.2020, and sometimes sRGB. And I already had most of the math.

> Your links to jpeg-decoder seem broken, so I don't see what changed

I was too lazy to make a real patch :) I've added https://github.com/awxkee/gainforge/blob/282ade44f4aead1d6dd04b471fa28a2ed4373aae/app/jpeg-decoder/src/parser.rs#L812. This never finds the correct marker; it might be a bit messy, but jpeg-decoder only finds XMP APP markers. It is also not able to decode the gain map image itself. I've also made the reader field public, to find the end of the image: https://github.com/awxkee/gainforge/blob/282ade44f4aead1d6dd04b471fa28a2ed4373aae/app/jpeg-decoder/src/decoder.rs#L102

The actual image reading happens here

awxkee avatar Feb 25 '25 16:02 awxkee

We've handled all JPEG markers in zune-jpeg. Here is some info if someone wants to see how it looks. In addition to what I wrote before, an XML serializer is also required.

All of this was expected to be easier than the PQ/HLG curve, but in fact, it's all overcomplicated.

awxkee avatar Mar 02 '25 19:03 awxkee

> Have you really written your own CMS for this!?
>
> There's https://lib.rs/qcms from Firefox. I maintain lcms2 bindings https://lib.rs/lcms2, and although that is C, I found it pretty easy to build and stable.

This is the CMS for this. It was slightly more work than expected, but it should handle all these cases and most common web images.

awxkee avatar Mar 11 '25 20:03 awxkee