image-tiff icon indicating copy to clipboard operation
image-tiff copied to clipboard

CCITT group 4 (Fax4) decoding support

Open stephenjudkins opened this issue 10 months ago • 11 comments

This adds support for decoding CCIT group 4 tiff images by using the fax crate.

I've found that the photometric interpretation tag is ignored by all the extant decoders I've found, and attempting to correctly interpret it will break some subset of images. Unfortunate, but I suspect it's best to follow the herd here?

stephenjudkins avatar Apr 05 '24 18:04 stephenjudkins

Probably check Tag::PhotometricInterpretation to decide what values are black and white.

s3bk avatar Apr 05 '24 19:04 s3bk

OK. I've done some research and found that

  1. other image processors (imagemagick, Apple Preview) ignore the photometric interpretation tag and always assume that it's WhiteIsZero
  2. fax4-encoded images I've found in the wild (which are all, unfortunately, confidential images of business checks) have the tag set correctly WhiteIsZero
  3. imagemagick creates fax4-encoded tiffs with photometric interpretation incorrectly as BlackIsZero but that doesn't matter because everything that loads the files assumes WhiteIsZero.

To conform with how everything in the wild I'd argue we just keep this hardcoded the way it is?

stephenjudkins avatar Apr 05 '24 21:04 stephenjudkins

hahaha. Yea, I think just keep it as is.

s3bk avatar Apr 05 '24 21:04 s3bk

Ideally, I'd like to see the create_reader act more lazily. The best way would be to have incremental decoding, but the fax crate doesn't seem to support that. An intermediate step might be to generate the list of color transitions, and then lazily expand that into the provided output buffer

fintelia avatar Apr 07 '24 00:04 fintelia

I don't see a reason why incremental decoding could not be implemented.

It would be pretty easy to write something like

impl Decoder<R: Read> {
  fn new(reader: R) -> Self;
  fn advance(&mut self) -> Result<(), io::Error>;
  fn pels(&self) -> Pels;
}

s3bk avatar Apr 07 '24 16:04 s3bk

Yeah, I think an interface like that would work. Then it could be wrapped in another object that implemented Read by tracking how far into the row had been read and expanded Pels pixel by pixel

fintelia avatar Apr 07 '24 17:04 fintelia

Appreciate the feedback and all the back-and-forth here. I'm happy to help out as much as I can here to push this forward, but I don't want to step on @s3bk's feet if he's working on these changes.

stephenjudkins avatar Apr 08 '24 20:04 stephenjudkins

@stephenjudkins I pushed changes to the fax repo. There are strange differences with the last line again. I need to re-encode my samples with libtiff to check if the error is in the sample data or my code.

But you should be able to use the code to adapt this PR to the next version.

s3bk avatar Apr 08 '24 21:04 s3bk