dezoomify-rs

Option to losslessly join JPGs

Open otofoto opened this issue 10 months ago • 5 comments

Add an option to losslessly join JPGs after downloading, without the need to manually run external tools such as JPEGjoin or similar.

otofoto avatar Feb 09 '25 18:02 otofoto

That would be a very useful feature indeed, but it does require some work. @otofoto, would you be interested in taking on this task and opening a draft pull request? I'll review it.

lovasoa avatar Feb 10 '25 13:02 lovasoa

Apologies if this is a dumb question, but I'm confused as to what is even really being asked here?

The whole point of Dezoomify is to extract and assemble tiled panoramas, and I would assume that it does join the tiles losslessly to begin with?

Does it end up using lossy compression in some cases, or something?

I have noticed that at times Dezoomify-rs will spit out images at a lower file size than the web version given the same input xml or json or url (though other times, rs serves a higher resolution image than what the web version can handle?), so now I am wondering if that plays into this. I'm also curious what's going on in the situations I mentioned: I want to be able to extract/save the highest quality, as close to the original native image as possible!

jabberwockxeno avatar Feb 27 '25 00:02 jabberwockxeno

Here is my best effort to explain lossy and lossless jpeg joining.

1. What is a JPG?

JPG reduces file size by discarding some image data. This is called lossy compression, meaning some details are lost forever to save space.


2. How JPG Works

When a JPG is created:

  1. Color Conversion: The image is split into brightness (luminance) and color (chrominance). Humans notice brightness more, so color details are reduced.
  2. Divided into Blocks: The image is cut into small 8x8 pixel blocks.
  3. Math Magic (DCT): Each block is transformed into a combination of predefined patterns (like edges or smooth areas) using math. This step organizes the data from "important" to "less important" (fine grained details).
  4. Throwing Away Details (Quantization): The "less important" patterns are rounded or removed. This is where loss happens! How much detail is thrown away is configurable through a "quality" parameter. In dezoomify-rs, this is controlled through the --compression command line parameter, which defaults to 20.
  5. Final Compression: The remaining data is packed tightly using a clever coding method. This does not discard any additional data.

When you open a JPG, the process is reversed, but the lost details can’t be recovered.
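To make the quantization step concrete, here is a minimal sketch of the transform, quantize, and decode round trip. It is not what any real encoder does verbatim: it works on a single row of 8 samples instead of an 8x8 block, and the sample values and the `STEPS` table are made up for illustration.

```python
import math

N = 8  # JPEG uses 8x8 blocks; one 8-sample row keeps the sketch short


def _scale(k):
    # Orthonormal DCT scaling factor for frequency index k
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(samples):
    """1-D DCT-II: turn 8 samples into 8 frequency coefficients."""
    return [
        _scale(k) * sum(s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n, s in enumerate(samples))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct(): turn 8 coefficients back into 8 samples."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]


def quantize(coeffs, steps):
    # The lossy step: each coefficient is rounded to a multiple of its step size
    return [round(c / q) for c, q in zip(coeffs, steps)]


def dequantize(levels, steps):
    return [level * q for level, q in zip(levels, steps)]


# Hypothetical step sizes: coarser rounding for higher frequencies
STEPS = [8, 8, 12, 16, 24, 32, 48, 64]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel values
levels = quantize(dct(row), STEPS)       # what would be stored in the file
decoded = [round(v) for v in idct(dequantize(levels, STEPS))]

print("original:", row)
print("decoded: ", decoded)
```

Without the `quantize` call the round trip reproduces the input up to floating-point error; with it, the decoded row is close to the original but not identical.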


3. How JPGs Are Usually Joined (Lossy Method)

  1. Open the JPGs: Each JPG is decompressed into pixels. The decoded pixels are not the exact same as the ones that were in the original image, because of the process explained above.
  2. Stitch Them Together: Combine the pixels into one big image.
  3. Save as JPG: The new image is compressed again into a new JPG, losing slightly more details.

This is how almost all software that joins JPG works, including the current versions of dezoomify and dezoomify-rs.

The additional quality loss is usually not noticeable to the human eye if the final compression level is low enough. A --compression value of 20 is usually fine, but you can set it even lower (at the expense of larger files).
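A toy illustration of why the second save loses a bit more: the same coefficient is rounded twice, with two different step sizes. The numbers are invented, and a real decode and re-encode cycle also involves pixel rounding and colour conversion, which this sketch ignores.

```python
def requantize(value, step):
    """Quantize then dequantize one DCT coefficient with the given step size."""
    return round(value / step) * step


original = 43                     # coefficient in the source image (made up)
tile = requantize(original, 10)   # the tile as served: first lossy generation
rejoined = requantize(tile, 12)   # decoded and re-saved with other settings: second generation

# Error after one generation, then after two
print(abs(original - tile), abs(original - rejoined))  # prints "3 7"
```

A lossless join would keep the value stored in the tile as-is, so the error would stay at that of the first generation.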


4. What is Lossless Joining?

Lossless joining avoids re-compressing the image, so no extra details are lost. Instead, it combines the compressed data directly. This is tricky and requires specific conditions:

Conditions for Lossless Joining
  1. Same Settings: Both JPGs must use the same compression settings (like how much color is reduced).
  2. Block Alignment: The images must align perfectly with the 8x8 pixel blocks used in JPGs. For example, if one image ends halfway through a block, it won’t work.
  3. No Overlaps: The images must fit together like puzzle pieces without overlapping.
  4. Matching Color Reduction: If one image reduces color in a certain way, the other must do the same.
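As a rough sketch, these conditions could be checked like this before attempting a lossless join. The `Tile` class and its fields are hypothetical; a real tool would have to read this information from each tile's JPEG headers. One detail is added here as an assumption about how the check would be done in practice: when colour is subsampled, the alignment unit is the MCU, which can be larger than 8x8 (for example 16x16 for 4:2:0).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Tile:
    # Hypothetical metadata; not an existing dezoomify-rs type
    x: int                 # position of the tile in the final image, in pixels
    y: int
    width: int
    height: int
    subsampling: str       # e.g. "4:4:4" or "4:2:0"
    quant_tables: tuple    # quantization tables, flattened


# Pixel size of the minimum coded unit (MCU) for common subsampling modes
MCU_SIZE = {"4:4:4": (8, 8), "4:2:2": (16, 8), "4:2:0": (16, 16)}


def overlaps(a, b):
    """True if the two tile rectangles share at least one pixel."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def lossless_join_problems(tiles):
    """Return the reasons why the tiles cannot be joined losslessly (empty if none found)."""
    problems = []
    first = tiles[0]
    if any(t.subsampling != first.subsampling or t.quant_tables != first.quant_tables
           for t in tiles):
        problems.append("tiles use different compression settings")
    mcu = MCU_SIZE.get(first.subsampling)
    if mcu is None:
        problems.append("unsupported subsampling mode")
    elif any(t.x % mcu[0] or t.y % mcu[1] for t in tiles):
        problems.append("a tile does not start on a block boundary")
    if any(overlaps(a, b) for a, b in combinations(tiles, 2)):
        problems.append("tiles overlap")
    return problems
```

For example, two 256x256 tiles placed at x=0 and x=256 with identical settings pass, while moving the second tile to x=250 reports both a misalignment and an overlap.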

5. How Lossless Joining Works

  • The compressed data from each JPG is stitched together directly, without decompressing or re-compressing.
  • Special algorithms are needed to check if the images meet the conditions and combine them correctly.
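The core of the idea can be sketched as follows, assuming each tile has already been read into a grid of quantized coefficient blocks by some JPEG library (that reading step, and writing the result back out, are not shown). Getting at the blocks typically means undoing the lossless final packing step, but never the quantization, so no detail is lost. The function name and data layout are made up for this example.

```python
def join_block_grids(tiles, tiles_per_row):
    """Join tiles into one grid of coefficient blocks.

    Each tile is a list of rows, each row a list of blocks. Blocks are
    treated as opaque values and are only moved, never modified.
    Tiles are given in reading order (left to right, top to bottom).
    """
    rows = []
    for start in range(0, len(tiles), tiles_per_row):
        band = tiles[start:start + tiles_per_row]
        if len({len(tile) for tile in band}) != 1:
            raise ValueError("tiles in the same row must have the same height in blocks")
        for r in range(len(band[0])):
            # Concatenate row r of every tile in this horizontal band
            rows.append([block for tile in band for block in tile[r]])
    return rows


# Four tiny tiles of 2x2 blocks each; strings stand in for real 64-coefficient blocks
tiles = [[[f"t{t}r{r}c{c}" for c in range(2)] for r in range(2)] for t in range(4)]
joined = join_block_grids(tiles, tiles_per_row=2)
for row in joined:
    print(row)
```

Because the blocks are copied untouched, the joined image decodes to exactly the same pixels as the individual tiles would.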

6. Why Lossless Joining is Rare

  • Most software (like Photoshop) uses the lossy method, which is simpler but loses quality.
  • Lossless joining requires careful preparation and specialized JPEG decoders.

Summary

  • Lossy Joining: Opens and re-saves JPGs, losing some quality. Works on any set of tiles.
  • Lossless Joining: Combines JPGs without re-compressing, keeping quality intact. More difficult to implement, and only works when all individual tiles meet specific conditions.

lovasoa avatar Feb 27 '25 08:02 lovasoa

Does it end up using lossy compression in some cases, or something?

The same goes for rotation if it is not done on the fly using the EXIF rotation tag. You lose quality when re-saving after a rotation, but it is also possible to rotate losslessly in 90-degree increments.

Read this: betterjpeg.com/lossless-rotation.htm

otofoto avatar Feb 27 '25 17:02 otofoto

I too would love to have this lossless JPEG saving option, preferably as the default instead of recompression at JPEG quality 80. Of course it wouldn't work for really huge images (JPEG seems to have a size limit of 64k pixels per side).

BrokenBrainiac avatar Mar 10 '25 06:03 BrokenBrainiac