rawspeed icon indicating copy to clipboard operation
rawspeed copied to clipboard

Add LJpeg predictors 2-7

Open kmilos opened this issue 3 years ago • 34 comments

Proof of concept (non-performant) to address #189 and #258

kmilos avatar Dec 30 '21 10:12 kmilos

I'm currently crossing all t's and dotting all i's so that #325 can happen.

LebedevRI avatar Dec 30 '21 11:12 LebedevRI

Codecov Report

Attention: Patch coverage is 12.85714% with 61 lines in your changes are missing coverage. Please review.

Project coverage is 60.76%. Comparing base (876c91f) to head (fb143a7). Report is 142 commits behind head on develop.

Files Patch % Lines
...rc/librawspeed/decompressors/LJpegDecompressor.cpp 10.60% 59 Missing :warning:
...zz/librawspeed/decompressors/LJpegDecompressor.cpp 0.00% 2 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #334      +/-   ##
===========================================
- Coverage    60.95%   60.76%   -0.20%     
===========================================
  Files          273      273              
  Lines        16408    16466      +58     
  Branches      2077     2078       +1     
===========================================
+ Hits         10002    10006       +4     
- Misses        6278     6332      +54     
  Partials       128      128              
Flag Coverage Δ
benchmarks 11.84% <0.00%> (-0.05%) :arrow_down:
integration 44.81% <15.00%> (-0.14%) :arrow_down:
linux 56.94% <13.04%> (-0.20%) :arrow_down:
macOS 25.20% <0.00%> (-0.07%) :arrow_down:
rpu_u 44.81% <15.00%> (-0.14%) :arrow_down:
unittests 21.55% <0.00%> (-0.08%) :arrow_down:
windows ∅ <ø> (∅)

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Dec 30 '21 11:12 codecov[bot]

For prediction > 4 and CFA images, it is common practice to pack two lines into one, so instead of:

RGRGRGRG
GBGBGBGB
RGRGRGRG
GBGBGBGB

the jpeg image becomes:

RGRGRGRGGBGBGBGB
RGRGRGRGGBGBGBGB

Because this pattern works better for prediction modes > 4 and component count = 2.

This is currently not handled by the code, as it is assumed that the dimension of the ljpeg stream is the same as the image/tile.

[rawspeed] (dummyreg5.dng) void rawspeed::AbstractDngDecompressor::decompress() const, line 217: Too many errors encountered. Giving up. First Error:
virtual void rawspeed::LJpegDecompressor::decodeScan(), line 112: LJpeg frame (1184, 79) is smaller than expected (592, 158)

I will generate you a set of DNG files with pred 1-7, comp=2 and packed lines later this day.

cytrinox avatar Jan 07 '22 09:01 cytrinox

My understanding is that this reshaping happens at a different level of abstraction, and that the number of SOF3 channels at this stage is already correct, and could indeed be different from the CFA channels (i.e. 1). This is how it already works in rawspeed for "normal" DNG lossless compression (2 channels in SOF3, pred 1)

See also my findings in https://github.com/darktable-org/rawspeed/issues/189

The current code works out if you're reshaping by widening (Adobe), but doesn't work if you're reshaping by squeezing (which is what Blackmagic do).

kmilos avatar Jan 07 '22 09:01 kmilos

In any case, I will not take this further (apart from fixing any overflows/underflows you can demonstrate), as it works on Apple's DNGs which I wanted to give people a potential patch to use on until Roman implements this in Halide.

kmilos avatar Jan 07 '22 09:01 kmilos

The current code works out if you're reshaping by widening (Adobe), but doesn't work if you're reshaping by squeezing (which is what Blackmagic do).

There are multiple options. First, use comp=1, then w/h of the ljpeg is identical to tile w/h. For CFA images, it's wise to use comp=2, then lpjeg width is divided by two (squeezing) without reducing the height. And for prediction 4-7, you pack two lines into one, leads to streching (width*2) and (height/2). You can use line packing with just one component, event four is save if the implementation respects it.

Adobe uses prediction 1 and comp=2 and because of that, the width is half but height is unchanged. BlackMagic uses comp=2, too, but because of the prediction mode, SOF3 width becomes (width/2)*2 and SOF3 height (height/2)

cytrinox avatar Jan 07 '22 09:01 cytrinox

BlackMagic uses comp=2

Blackmagic use comp=1 and 2H x W/2

kmilos avatar Jan 07 '22 09:01 kmilos

(FYI if it isn't obvious i'm in #pixls.us @ OFTC)

<...> until Roman implements this in Halide.

I'm basically having an existential crisis of conciseness here, and that has been happening for some time now. I really don't like how all of our (not just darktable-org's, but more in general in the whole greater open graphics ecosystem) image processing loops are just hacked together with shitty unreadable unmodifyable unperformant C-like spagetti. Moving to higher-level abstraction should be a huge win, but i'm afraid of being false messiah, and just making things worse.

LebedevRI avatar Jan 07 '22 09:01 LebedevRI

The sentiment is shared, I don't think it's ideal either. But as you already know there are a limited number of motivated/interested contributors who can properly deal w/ the C++ templated abstraction already laid out here and confident in what LLVM will compile that code to, imagine how many there will be proficient in Halide... In the meantime, the requests to support new codecs keep piling on :/

kmilos avatar Jan 07 '22 10:01 kmilos

BlackMagic uses comp=2

Blackmagic use comp=1 and 2H x W/2

Oh... I just assumed this because otherwise the rawspeed error in #189 wouldn't make sense. Let me check this deeper when I find some time, from the errors and width/height values reported it should be theoretically the same issue as with my line-packed dng files.

cytrinox avatar Jan 07 '22 10:01 cytrinox

The error in #189 is there because it precisely assumes 2 components so W/2 * 2 = W (or in general it assumes you'll always have W/N *N comp = W). Blackmagic don't do that (I guess because of the predictor), so that assumption brakes down.

Mind you, we also have comp = 4 in the Sony case, but I haven't been motivated to do the plumbing for that one, which should work out hopefully better than the Blackmagic case.

kmilos avatar Jan 07 '22 10:01 kmilos

(FYI if it isn't obvious i'm in #pixls.us @ OFTC)

<...> until Roman implements this in Halide.

I'm basically having an existential crisis of conciseness here, and that has been happening for some time now. I really don't like how all of our (not just darktable-org's, but more in general in the whole greater open graphics ecosystem) image processing loops are just hacked together with shitty unreadable unmodifyable unperformant C-like spagetti. Moving to higher-level abstraction should be a huge win, but i'm afraid of being false messiah, and just making things worse.

The sentiment is shared, I don't think it's ideal either. But as you already know there are a limited number of motivated/interested contributors who can properly deal w/ the C++ templated abstraction already laid out here and confident in what LLVM will compile that code to, imagine how many there will be proficient in Halide... In the meantime, the requests to support new codecs keep piling on :/

Let me put it it this way. While i have not made a decisions yet, i am starting to believe that this is a "survival of the fittest" question. As in, either our ecosystem does make that evolutionary jump, or it does deserve to go extinct.

LebedevRI avatar Jan 07 '22 10:01 LebedevRI

Here is a set of DNG files with various predictors: https://chaospixel.com/pub/temp/dng_pred_tests.tar.bz2

cytrinox avatar Jan 07 '22 13:01 cytrinox

@kmilos Just for fun, I've looked into https://github.com/yanburman/dng_sdk/blob/master/source/dng_lossless_jpeg.cpp#L2501 and https://github.com/yanburman/dng_sdk/blob/master/source/dng_lossless_jpeg.cpp#L1428 which comes from the Adobe DNG SDK lossless decoder. There is a special case when Px = Ra + ((Rb – Rc)/2) leads to Ra + (-1 / 2) which is different to Ra + (-1 >> 1) and I wanted to know how Adobe solved this. They just use signed integers and doing a right shift, which leads to -1 >> 1 = -1.

cytrinox avatar Jan 07 '22 14:01 cytrinox

Yet more spaghetti, but I believe this is now functionally correct for the intermediate predictor calculations.

kmilos avatar Jan 07 '22 14:01 kmilos

Btw, the Sony case won't work neither w/o some extra reshaping because we're not in the W/4 x 4 comp arrangement assumed, but H/2 x W/2 x 4.

kmilos avatar Jan 07 '22 15:01 kmilos

FWIW, i've been thinking about this, and i suppose i know a way to implement the desired support without hurting the current cases.

I haven't checked, but i'm not sure we have all the necessary samples on RPU though -- i suspect we only have the ones for predictors 2 and 7. (ignoring the obvious predictor 0), and for sure i don't expect that we have the full permutation matrix of all 4 variants of component per pixel (1, 2, 3, 4).

So essentially we need at least 8*4 samples. (i may be forgetting some other permutation). It's obviously not fatal, it just requires someone to write a testcase generator for them.

LebedevRI avatar Jan 30 '23 01:01 LebedevRI

So essentially we need at least 8*4 samples. (i may be forgetting some other permutation).

Right, restart interval.

LebedevRI avatar Jan 30 '23 02:01 LebedevRI

I haven't checked, but i'm not sure we have all the necessary samples on RPU though -- i suspect we only have the ones for predictors 2 and 7.

Blackmagic DNGs have predictor 6, but have another problem (component arrangement as mentioned above and in https://github.com/darktable-org/rawspeed/issues/189) preventing its testing...

So it might be an idea to generalize the component arrangement first (1x1, 1x2, 2x1, 2x2), thus ticking Sony lossless off the list?

Or perhaps you prefer to handle both component arrangement and predictors in one go? I hope @cytrinox could help with the 4*8 test vectors then... 🙏

kmilos avatar Jan 30 '23 06:01 kmilos

So it might be an idea to generalize the component arrangement first (1x1, 1x2, 2x1, 2x2), thus ticking Sony lossless off the list?

Sony FUBAR'ed their "LJpeg" implementation to the point of it no longer being a proper LJpeg. We won't be able to support it in a generic LJpeg decompressor.

LebedevRI avatar Jan 30 '23 13:01 LebedevRI

to the point of it no longer being a proper LJpeg

Why do you think so? AFAICT it is, just that the component scatter/gather is different from the Adobe one... I.e. for a 2x2 Bayer CFA Adobe does 2 components like this

1 2 1 2 ... 1 2 1 2 ...

While Sony does 4 components like this:

1 2 1 2 ... 3 4 3 4 ...

Each component in both Adobe and Sony cases seems to be regular LJpeg SOF3 (w/ already supported predictor 1)... The arrangement of components has nothing to do w/ LJpeg itself, and current LJpeg code seems to work fine once the arrangement is taken into account..

What they FUBAR'ed is on the TIFF container level by having both Tile* and Strip* tags simultaneously, and making ImageWidth/Height an integer of tile size, when it doesn't have to be (which is why I advocated for the absolute crop for these models covering all modes instead of a relative, negative one; but this then makes APS-C crop more complicated so it's a lose-lose either way)...

kmilos avatar Jan 30 '23 13:01 kmilos

And BlackMagic do something even weirder, from a 2x2 Bayer CFA:

1 1 ... 1 1 ...

to a single components LJpeg like this:

1 1 1 1

(w/ predictor 6)

kmilos avatar Jan 30 '23 14:01 kmilos

and current LJpeg code seems to work fine once the arrangement is taken into account..

That's my point exactly actually. As long as that sonyArrange_ can't be deduced from the LJpeg container itself, it's broken.

LebedevRI avatar Jan 30 '23 15:01 LebedevRI

As long as that sonyArrange_ can't be deduced from the LJpeg container itself, it's broken.

Sure, and I already mentioned there that should be generalized - one just needs to parse the SOF3 header for the no. of components and their frame size, and compare to the TIFF CFA tile size, and that gives the arrangement:

E.g.

Sony case: TIFF tile is 512x512, SOF3 is 4 components of 256x256

Adobe case: TIFF tile is 256x256, SOF3 is 2 components of 256x128

BlackMagic case: TIFF tile is 512x512, SOF3 is 1 component of 256x1024

The arrangement is calculated from the TIFF tile HxW divided by SOF3 YxX, and to check conformance, one must have N_comp*Y*X = H*W (I think already checked).

Both the Sony and BlackMagic SOF3 headers are correct in what they implement AFAICT, so nothing is really at odds here, "just" missing implementation in rawspeed...

kmilos avatar Jan 30 '23 15:01 kmilos

I see. So for cpp=2, we have two possible layouts:

x x

and

x
x

and for cpp=4 we have three:

x x x x
x x
x x
x
x
x
x

.. that multiplies the needed sample set by another x2 or so.

LebedevRI avatar Jan 30 '23 18:01 LebedevRI

that multiplies the needed sample set by another x2 or so

😮

kmilos avatar Jan 30 '23 19:01 kmilos

and for cpp=4 we have three

Not quite. For cpp=4 you can have two:

1 2 3 4

or

1 3 2 4

i.e. row or column major scatter of the 2x2 Bayer into 4 separate component planes. Like this for row major:

image

(In theory, one can of course have all the other ways of ordering 4 components, 4! in total, but that would be a bit silly...)

For cpp=1 you could have 2 (well, 3 if also you count identity, i.e. no rearrangement, just like you illustrated above).

kmilos avatar Jan 31 '23 09:01 kmilos

Sorry, not following the logic there. I can see how the arrangements i listed can be specified via different frame sizes, but not the other orders.

LebedevRI avatar Jan 31 '23 13:01 LebedevRI

Sorry, not following the logic there. I can see how the arrangements i listed can be specified via different frame sizes, but not the other orders.

Np, I'll try to do a clearer drawing for the three (Adobe, Sony, and BlackMagic) cases a bit later and add here.

kmilos avatar Jan 31 '23 13:01 kmilos

@LebedevRI I hope the following illustration of the scatter/gather schemes makes it clearer:

dng_ljpeg

These are all perfectly valid w.r.t. ITU-T T.81. The components highlighted in yellow make up one MCU, and the dotted ones are used as predictors (usually 1, i.e. just horizontal, but 6 for Blackmagic/CinemaDNG case).

(There is also the trivial case of DNG spp=3 LinearRaw with 1:1 mapping to SOF3 Nf=3, which is already supported for predictor 1, but not predictor 7 using the row above like Apple do.)

kmilos avatar Feb 01 '23 14:02 kmilos