TIFF LZW style - When to change the code length

Open lpatiny opened this issue 4 years ago • 1 comments

Looking at the spec: https://www.fileformat.info/format/tiff/corion-lzw.htm

"The function GetNextCode() retrieves the next code from the LZW-coded data. It must keep track of bit boundaries. It knows that the first code that it gets will be a 9-bit code. We add a table entry each time we get a code, so GetNextCode() must switch over to 10-bit codes as soon as string #511 is stored into the table."

So, according to the spec, we need to change the code length as soon as #511 is stored.
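To make the "keep track of bit boundaries" part concrete, here is a minimal sketch of reading one variable-length code from a byte buffer. It assumes codes are packed most-significant bit first, as in standard TIFF LZW streams. The name `readCode` and its parameters are hypothetical and are not taken from image-js/tiff.

```typescript
// Hypothetical helper, not the image-js/tiff implementation.
// Reads `codeLength` bits starting at absolute bit position `bitOffset`,
// assuming most-significant-bit-first packing.
function readCode(
  data: Uint8Array,
  bitOffset: number,
  codeLength: number,
): number {
  let code = 0;
  for (let i = 0; i < codeLength; i++) {
    const pos = bitOffset + i;
    const byte = data[pos >> 3]; // byte that contains this bit
    const bit = (byte >> (7 - (pos & 7))) & 1; // bit 0 is the MSB of the byte
    code = (code << 1) | bit;
  }
  return code;
}

// Three 9-bit codes packed back to back: ClearCode (256), 7, EOI (257).
const sample = new Uint8Array([0x80, 0x01, 0xe0, 0x20]);
const first = readCode(sample, 0, 9); // 256
const second = readCode(sample, 9, 9); // 7
const third = readCode(sample, 18, 9); // 257
```

The caller would advance `bitOffset` by the current code length after each read, so a change from 9 to 10 bits only affects how far the offset moves and how many bits the next read takes.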

However in the code the change is done at 510:

https://github.com/image-js/tiff/blob/73ca97100c0674855db1be3aa71bb5385785a09c/src/lzw.ts#L94-L96

I don't know which version is correct, but the confusion could be due to the TIFF version:

https://stackoverflow.com/questions/26366659/whats-special-about-tiff-5-0-style-lzw-compression

  • LZW codes are written to the stream in reversed bit order.
  • "New-style" increases the code size one symbol earlier than "old-style" (so-called "Early Change").
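The "one symbol earlier" rule can be sketched as a comparison against two different limits. This is only an illustration under stated assumptions: the decoder tracks `nextFreeCode`, the code value the next table entry will receive (258 for the first entry after a ClearCode), and codes never exceed 12 bits. The names `codeLengthAfterAdd` and `earlyChange` are hypothetical and do not come from image-js/tiff.

```typescript
// Hypothetical sketch of the code-length decision, not the library's code.
const MAX_CODE_LENGTH = 12; // assumption: TIFF LZW codes are at most 12 bits

function codeLengthAfterAdd(
  nextFreeCode: number, // code value the next table entry will get
  codeLength: number, // current code length in bits
  earlyChange: boolean, // true for the "early change" variant
): number {
  // Without early change the length grows once nextFreeCode no longer
  // fits in codeLength bits; with early change it grows one code sooner.
  const limit = earlyChange ? (1 << codeLength) - 1 : 1 << codeLength;
  if (nextFreeCode >= limit && codeLength < MAX_CODE_LENGTH) {
    return codeLength + 1;
  }
  return codeLength;
}

// After storing entry 510, nextFreeCode is 511.
const early = codeLengthAfterAdd(511, 9, true); // 10
const late = codeLengthAfterAdd(511, 9, false); // 9
```

In this sketch, whether the switch appears to happen "at 510" or "at 511" depends on whether the check looks at the entry just stored or at the next free code, so the two numbers can describe the same moment.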

However, the current implementation in the 'debug-lzw' branch seems correct, based on the LZW images we have and on a comparison with the output of ImageMagick's convert.

https://github.com/image-js/tiff/commit/199aa6da9d01df5d5a432339587c3d57aa83a5ec

lpatiny avatar Oct 30 '21 09:10 lpatiny

However in the code the change is done at 510:

I think it's just because #511 is the number you get if you count from one, and we count from zero.

targos avatar Oct 31 '21 07:10 targos