TiffParser: handle REFERENCE_BLACK_WHITE as an array of floats
Backported from a private PR
This addresses an edge case where the ReferenceBlackWhite TIFF tag is stored as an array of floats.
In most TIFF-based samples, this tag contains an array of 6 integers, as shown by tiffdump, e.g. using a public NDPI sample:
% tiffdump CMU-1.ndpi | grep Refere
ReferenceBlackWhite (532) LONG (4) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) LONG (4) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) LONG (4) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) LONG (4) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) LONG (4) 6<0 255 128 255 128 255>
TiffParser uses the IFD.getIFDIntArray API to retrieve the tag content; see
https://github.com/ome/bioformats/blob/1477ee96e5215c55f15cd3efdb6fd91e386a0023/components/formats-bsd/src/loci/formats/tiff/TiffParser.java#L1253
As per the TIFF specification, ReferenceBlackWhite is an array of type Rational, i.e. rational numbers. I do not have a representative sample to share for testing, and I could not find an existing sample file in the curated QA repository. For the affected files, the output of tiffdump is an array of 6 floats:
% tiffdump ReferenceBlackWhite_sample.ndpi | grep Referen
ReferenceBlackWhite (532) FLOAT (11) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) FLOAT (11) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) FLOAT (11) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) FLOAT (11) 6<0 255 128 255 128 255>
ReferenceBlackWhite (532) FLOAT (11) 6<0 255 128 255 128 255>
For such files, setId is not affected, but calling openBytes triggers a loci.formats.FormatException with the following message: REFERENCE_BLACK_WHITE directory entry is the wrong type (got [F, expected Number, int[] or Number[]). This PR proposes to update TiffParser to handle this scenario like libtiff does. If getIFDIntArray throws an exception, the parsing class attempts to retrieve the IFD value as a float array and casts each value to int for compatibility with the existing reference field.
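The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual TiffParser change: the class ReferenceBlackWhiteFallback, the methods readAsIntArray and readReference, and WrongTypeException are all hypothetical stand-ins, used here only to mimic an integer-only accessor that rejects a float[] value and a caller that then falls back to a per-element cast.

```java
import java.util.Arrays;

public class ReferenceBlackWhiteFallback {

  /**
   * Hypothetical stand-in for the exception raised on a type mismatch.
   * Unchecked here only to keep the sketch short.
   */
  public static class WrongTypeException extends RuntimeException {
    public WrongTypeException(String message) {
      super(message);
    }
  }

  /**
   * Hypothetical stand-in for an integer-only accessor: it accepts a Number,
   * an int[] or a Number[], and rejects everything else (including float[]).
   */
  public static int[] readAsIntArray(Object value) {
    if (value == null) {
      return null;
    }
    if (value instanceof Number) {
      return new int[] {((Number) value).intValue()};
    }
    if (value instanceof int[]) {
      return (int[]) value;
    }
    if (value instanceof Number[]) {
      Number[] numbers = (Number[]) value;
      int[] result = new int[numbers.length];
      for (int i = 0; i < numbers.length; i++) {
        result[i] = numbers[i].intValue();
      }
      return result;
    }
    throw new WrongTypeException("directory entry is the wrong type (got "
      + value.getClass().getName() + ", expected Number, int[] or Number[])");
  }

  /**
   * Tries the integer accessor first. If it fails and the raw value is a
   * float[], casts each element to int; otherwise rethrows the original error.
   */
  public static int[] readReference(Object value) {
    try {
      return readAsIntArray(value);
    } catch (WrongTypeException e) {
      if (!(value instanceof float[])) {
        throw e;
      }
      float[] floats = (float[]) value;
      int[] reference = new int[floats.length];
      for (int i = 0; i < floats.length; i++) {
        // Narrowing cast: any fractional part is truncated.
        reference[i] = (int) floats[i];
      }
      return reference;
    }
  }

  public static void main(String[] args) {
    float[] stored = {0f, 255f, 128f, 255f, 128f, 255f};
    // prints [0, 255, 128, 255, 128, 255]
    System.out.println(Arrays.toString(readReference(stored)));
  }
}
```

In this sketch the cast is lossless only when every float holds an integral value, as in the tiffdump output above; a value such as 127.9f would be truncated to 127.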
This PR does not attempt to handle the more generic case where the ReferenceBlackWhite tag is a true array of floats that cannot be cast to integers. This would be legitimate as per the specification, but handling it would require a broader review of the correction code used when unpacking the bytes and, to date, I have no representative sample for this scenario.
Given the importance of TiffParser, we will definitely want to make sure this does not cause any regression in any TIFF-based file format.