RGGB compression sometimes worse than RawSpeedCompress+LZMA2
In my ongoing tests with RGGB files created by dcraw, FLIF's compression is sometimes worse than RawSpeedCompress+LZMA2. RawSpeedCompress uses different delta variants for uncompressed pixel data and applies ZigZag encoding, which helps all kinds of general-purpose compressors.
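For context, here is a minimal sketch of what Delta + ZigZag preprocessing typically looks like (my own illustration, not RawSpeedCompress's actual code; the function name is made up):

```cpp
#include <cstdint>
#include <vector>

// Delta: each pixel is replaced by its difference from the previous pixel,
// so smooth image data turns into small signed residuals.
// ZigZag: maps ..., -2, -1, 0, 1, 2, ... to 3, 1, 0, 2, 4, ... so small
// magnitudes become small unsigned values, which LZMA2-style compressors like.
std::vector<uint16_t> delta_zigzag(const std::vector<uint16_t>& pixels) {
    std::vector<uint16_t> out;
    out.reserve(pixels.size());
    uint16_t prev = 0;
    for (uint16_t p : pixels) {
        int32_t d = static_cast<int16_t>(p - prev);  // wrap-around delta, sign-extended
        out.push_back(static_cast<uint16_t>(d >= 0 ? d * 2 : -d * 2 - 1)); // ZigZag
        prev = p;
    }
    return out;
}
```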
https://drive.google.com/file/d/0ByLIAFlgldSoNFR6RmcwdTgxQTQ/view?usp=sharing
leica_m82_05.rggb 11.359.612 (FLIF 0.1)
leica_m82_05.rggb 10.928.429 (FLIF 0.1.3_slower_but_stronger -n)
leica_m82_05.rggb 5.560.365 (RawSpeedCompress3 + LZMA2)

https://drive.google.com/file/d/0ByLIAFlgldSoalhkSkdsNmJNaDA/view?usp=sharing
nikon_1_v2_17.rggb 9.931.720 (FLIF 0.1)
nikon_1_v2_17.rggb 9.636.696 (FLIF 0.1.3_slower_but_stronger -n)
nikon_1_v2_17.rggb 8.810.004 (RawSpeedCompress3 + LZMA2)

https://drive.google.com/file/d/0ByLIAFlgldSoZmJDUXl2YWVPS2M/view?usp=sharing
nikon_d5200_14.rggb 24.334.053 (FLIF 0.1)
nikon_d5200_14.rggb 23.684.359 (FLIF 0.1.3_slower_but_stronger -n)
nikon_d5200_14.rggb 21.527.675 (RawSpeedCompress3 + LZMA2)
Can you please check why FLIF is worse here? Will future versions also get something like Delta+ZigZag+LZ?
In almost every case, the -n switch provided better compression on RGGB.
Could you try renaming the RGGB file to PPM and compressing it with FLIF interlaced?
I wrote a simple crusher script for flif (matthiaskrgr/flifcrush); you can try running it on one of the images:
export FLIF=/path/to/flif/binary
main.py nikon_d5200_14.rggb
It will take a lot of time, but it might get the size down a bit further.
Edit: BE CAREFUL, it might need a lot of RAM.
I found out what is happening. I know how to solve this. I just need to implement it. Soon FLIF will beat that other method.
Here you go:
11,359,333 leica_m82_05.rggb.flif0.1.3
5,117,762 leica_m82_05.rggb.flif0.1.4
8,450,275 nikon_1_v2_17.rggb.flif0.1.4
20,384,559 nikon_d5200_14.rggb.flif0.1.4
There are still RGGB files that compress better with Delta + ZigZag + LZMA2 than they do with FLIF:
https://drive.google.com/file/d/0ByLIAFlgldSoNlR4ZURaTUE2QTg/view?usp=sharing
pentax_q10_19.rggb 10.692.568 (FLIF 0.1.4)
pentax_q10_19.rggb 10.167.315 (Delta + ZigZag + LZMA2)

https://drive.google.com/file/d/0ByLIAFlgldSoSkZyQzJvWFNpQTQ/view?usp=sharing
panasonic_lumix_dmc_gh3_10.rggb 14.186.367 (FLIF 0.1.4)
panasonic_lumix_dmc_gh3_10.rggb 13.496.039 (Delta + ZigZag + LZMA2)
PLC helped a lot: on most files FLIF is stronger than without it, and sometimes it also outperforms RawSpeedCompress3 + Delta + ZigZag + MCM 0.83 -x9, but I think it has the capability of beating it on every RGGB image. Can you check that again, please? See the almost complete test here: http://www.squeezechart.com/camera-raw.html
As I see it, FLIF outperforms the other method by >1% on average, which is really good. Since each compressor has its own method, a compressor that is usually worse than the others can still be more efficient in some cases.
You can check some plots at issue #144; you will see that FLIF can sometimes outperform JPEG in its own sandbox, and sometimes WebP outperforms FLIF.
The Bayer CFA layout is different for different camera models; FLIF currently assumes some specific subpixel layout for its YIQ transform, and if this assumption is wrong, then compression suffers a bit.
You can use flif -R to disable YIQ. Also, disabling channel compactification with -C can help.
14,005,394 panasonic_lumix_dmc_gh3_10.rggb.R.flif
13,416,647 panasonic_lumix_dmc_gh3_10.rggb.RC.flif
10,481,645 pentax_q10_19.rggb.R.flif
10,481,603 pentax_q10_19.rggb.RC.flif
Does anyone have a good source of information on these subpixel layouts? I tried reading the dcraw source code, but that code is not very readable...
Dcraw's exif output can show the CFA pattern as a matrix {Color1, Color2}{Color3, Color4}. DNG and many (all?) RAW formats have an Exif field for it.
Edit: I've found the raw files of panasonic_lumix_dmc_gh3_*.rw2 here: http://www.photographyblog.com/reviews/panasonic_lumix_dmc_gh3_review/sample_images/ The CFA pattern is GB/RG.
Maybe RawSpeed can help? https://github.com/klauspost/rawspeed/tree/Rawspeed They have a camera.xml database for supporting new cameras: https://github.com/klauspost/rawspeed/blob/develop/cameras-xml.md
I found that ExifTool does not mention the CFA pattern of the RW2, the libopenraw documentation says the pattern should be BGGR, and DCRAW says the CFA pattern of the file is GBRG.
Which one should we trust?
From "ISO TC 42 N 5737" TIFF & DNGs allows patterns matrix bigger than 2x2 and can use Red, Green, Blue, Cyan, Magenta, Yellow or White colors. TIFFtag name is CFAPattern (0x828e), maybe you could check this value before converting it.
Also: how does DCRAW output the extracted data? When FLIF reads a .rggb file, it uses the following layout, which I think is what DCRAW produces:

GG
RB
Another complication is that these camera raw formats do not seem to have the same range for R, G and B. I'm not even sure if G1 and G2 are always in the same range. So that also messes up the YIQ.
Probably the way to get the best compression (if you're not interested in progressive decoding) is to use flif -Rbn, with or without -C. You can usually improve further by using a higher -r and -S (simultaneously), e.g. flif -RCbnS120 -r8.
I'm sorry, but DCRAW outputs the original pattern; the most common case is:

RG
GB
I'm working on image-rggb.cpp; there is also an issue with maxval < 0xff on the blue plane.
Thanks, @psykauze. Now we should probably preprocess raw input to be in RG GB format.
You should also change the decoder part ;)
Edit: I've made some tests with "only" the planes correction; here are the results:

filename,before,after
TEST/canon_eos_5d_mark_iii_05.rggb,20098313,20175045
TEST/canon_eos_6d_14.rggb,19328964,19460476
TEST/canon_eos_m_04.rggb,18844941,18949992
TEST/fujifilm_finepix_x100_11.rggb,9652032,9640802
TEST/fujifilm_x_e1_20.rggb,11347023,11308835
TEST/fujifilm_xf1_08.rggb,6197773,6216062
TEST/leica_m82_05.rggb,5117725,5132863
TEST/leica_x1_10.rggb,9790809,9735075
TEST/nikon_1_v2_17.rggb,8449538,8416517
TEST/nikon_d4_10.rggb,15701302,15851157
TEST/nikon_d5200_14.rggb,20384199,20444366
TEST/olympus_epm2_16.rggb,13435899,13375579
TEST/olympus_om_d_e_m5_24.rggb,13260184,13345509
TEST/olympus_xz2_10.rggb,10397033,10394995
TEST/panasonic_lumix_dmc_gh3_10.rggb,14186056,14233905
TEST/panasonic_lumix_g5_15.rggb,14095164,14130656
TEST/pentax_k5_ii_12.rggb,18068348,18073325
TEST/pentax_q10_19.rggb,10692439,10398867
TEST/samsung_nx1000_19.rggb,14594902,14513184
TEST/samsung_nx20_01.rggb,15892754,15799060
TEST/sony_a55.rggb,13066285,13096163
TEST/sony_a77_08.rggb,17466963,17564793
TEST/sony_a99_04.rggb,15125553,15079356
So... I've made some analysis of the RAW files and found there is also rotation to take into account. The results after rotating the images to get the right CFA pattern are worse than before.
Here are the new results: log-newRGGB.txt
Maybe one image per camera model is not enough to really draw conclusions, but I wonder what is going on here. The differences in compression when changing the CFA pattern are not huge anyway, which suggests to me that doing the YIQ transform on R G1 B while storing 1+G2 in the 'alpha' channel is not the best approach -- it was just a quick & dirty hack anyway, to match RGGB with the RGBA functionality that was already there. Maybe a custom color transform for RGGB would be better.
I would also vote for a custom color transform for RGGB. At least for Sony there is an explanation: http://diglloyd.com/blog/2014/20140212_2-SonyA7-RawDigger-posterization.html
The Fuji CFA pattern is different from the others; see Wikipedia: https://en.wikipedia.org/wiki/Bayer_filter
That's why I disqualified the Fuji X-E1.
Also, I would like to try the R, (G1+G2)/2, B, (G1-G2) algo.
You're right about the R, G and B ranges too; there is a correction factor for each color.
Converting to R, (G1+G2)/2, B, (G1-G2) is not lossless: one bit is lost in the division by two and it cannot be recovered. But we could try to directly apply a modified variant of YIQ, maybe something like this:

Y = R + B + G1 + G2
I = R - B
Q = R + B - G1 - G2
Gdiff = G1 - G2
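(For what it's worth, this variant looks losslessly invertible to me; a quick derivation, assuming exact integer arithmetic:

R + B = (Y + Q) / 2 and G1 + G2 = (Y - Q) / 2
R = ((R + B) + I) / 2 and B = ((R + B) - I) / 2
G1 = ((G1 + G2) + Gdiff) / 2 and G2 = ((G1 + G2) - Gdiff) / 2

Every division is exact, since Y + Q = 2(R + B), (R + B) + I = 2R, and so on, so no bits are lost.)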
I'm not sure if encoding G1-G2 is really more efficient than just encoding G1 or G2. The difference is expected to be closer to zero, but it needs an extra bit to be represented.
There's no loss in the R, (G1+G2)/2, B, (G1-G2) conversion if you check whether the numbers are odd or even:

X = (G1+G2)/2 (always round down, e.g. N.5 => N); W = G1-G2;
Case W even: G1 = X + W/2; G2 = X - W/2;
Case W odd: G1 = X + (W+1)/2; G2 = X - (W-1)/2;

You're right, the difference needs an extra bit to be represented, but I think this extra W bit can be moved to X in some cases (e.g. if Xdec + Wdec/2 > 0xFFFF then that means Xdec = (Xenc << 1) + extra_bit). I need to do some calculations to check whether it is possible.
What is Y in that case distinction?
Y was W; I've edited the post, sorry.
Ah, right, the parity of the sum is equal to the parity of the difference so there would indeed be enough information to restore the exact sum. I'll try that transformation.
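A minimal round-trip sketch of that transformation, under the parity argument above (hypothetical helper functions, not the eventual FLIF code):

```cpp
#include <cstdint>

// Forward: X is the rounded-down average, W the signed difference.
void forward(int32_t G1, int32_t G2, int32_t& X, int32_t& W) {
    X = (G1 + G2) >> 1;   // floor of the average (G1, G2 are non-negative)
    W = G1 - G2;          // parity of W equals the parity of G1 + G2
}

// Inverse: exact, because the parity of W tells us whether the average
// lost a half when rounding down.
void inverse(int32_t X, int32_t W, int32_t& G1, int32_t& G2) {
    if (W % 2 == 0) {     // even difference: the average was exact
        G1 = X + W / 2;
        G2 = X - W / 2;
    } else {              // odd difference: restore the rounded-away bit
        G1 = X + (W + 1) / 2;
        G2 = X - (W - 1) / 2;
    }
}
```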
Also, I'm assuming that raw files have no more than 14 bpc; is that a valid assumption? It helps if we want to represent G1-G2 in a uint16_t...
I haven't yet seen sensors beyond 14 bits, so I think this is a valid assumption.
Be careful: (G1 - G2) is a signed value; maybe you want to store the data as '(Diff << 1) + sign'?
I'm storing it as 0x4000 + G1 - G2, which is always > 0 if the numbers are at most 0x3fff.
But unfortunately, my preliminary results (just testing on a single image) seem to indicate that any manipulation on the R G1 G2 B values (sum, avg, diff) just makes compression worse.
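For reference, the two storage options just discussed could look like this (a sketch with made-up names, not the actual image-rggb.cpp code):

```cpp
#include <cstdint>
#include <cstdlib>

// '(Diff << 1) + sign': sign-magnitude packing of d = G1 - G2.
uint16_t pack_sign_magnitude(int32_t d) {
    return static_cast<uint16_t>((std::abs(d) << 1) | (d < 0 ? 1 : 0));
}

// '0x4000 + G1 - G2': offset packing, always positive while the
// samples stay at or below 0x3fff (i.e. 14 bpc).
uint16_t pack_offset(int32_t d) {
    return static_cast<uint16_t>(0x4000 + d);
}
```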
Could you send me your new image-rggb.cpp, please? I would like to try some things with the Stephan set.
I pushed it as commented-out code (only in the rggb loading part, not in the decode/save part).
I've tried R, mean(G), B, diff(G) with many hacks, but it is still much worse (+20-50%) than before.
Also, I think I understand why the results were worse after we corrected the RGGB pattern. The red plane has about half the luminosity it should have (usually corrected by RAW decoders) and the blue plane is at ~70%. So maybe the YIQ transform is not efficient in this case.
I will try this hack: R => alpha plane, G2 => red plane, G1 => green plane, B => blue plane, and compare results.
There are also Bayer sensors that capture in 16-bit, for example Mamiya Leaf, Phase One, and also Hasselblad raw images. I will upload some samples (converted with DCRAW) later.
I made some (long) tests in interlaced mode with some hacks:
- I've rotated and mirrored the greyscales so that every image has the same (RGGB) pattern. You should see the transformation in the filename.
- I've disqualified the Sigma and the Finepix X-E1 pictures due to their atypical CFA patterns.
- Test "before" is before the modifications to image-rggb.cpp.
- Test "RGGB corrected" is just after finding there is an issue in the demosaicing algo in image-rggb.cpp.
- Test "RGGB+bpp": I've added bit-depth detection before initialising the planes.
- Test "RGGB+invertG2": I've only inverted the alpha plane (representing G2) with "maxval - pixel" instead of "1 + pixel".
- Test "bpp+invertG2": all the corrections above.
- Tests "G2G1BR", "G2G1RB", "G2RBG1" and "G1RBG2": I've tried changing the plane order to see what happens (G2RBG1 is like before the RGGB correction).
- Test "all invert": all planes were inverted ("maxval - pixel").