bug: bfconvert generates bad DICOMs from SVS files

Open jcupitt opened this issue 1 year ago • 3 comments

Hello everyone, thank you for this nice thing.

While working on openslide, I've come across a bfconvert bug when generating DICOM files.

DICOM has a PhotometricInterpretation tag to indicate either RGB or YCbCr colorspace in tiles. According to the DICOM spec, this tag is where the tile colourspace is kept, NOT the JPEG tiles themselves. On decode, you need to open each tile and force the tile colorspace from the DICOM header.

If you use -precompressed, conversion will copy over the JPEG tiles untouched, so if the DICOM header and the tile colorspace were correct beforehand, they will still match in the converted image.

If you convert without the -precompressed flag, bfconvert will re-encode the JPEGs and may well change the colorspace of the tiles. For example, SVS is saved as RGB (no chroma subsample), but bfconvert will save as YCbCr (chroma subsample). Now the DICOM PhotometricInterpretation will still say RGB, but the tiles will be YCbCr, so users will see crazy colors.
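To illustrate why the mismatch shows up as wrong colors, here is a minimal sketch assuming the full-range JFIF (BT.601) YCbCr conversion that baseline JPEG encoders commonly use. The function names are made up for this example and are not part of bfconvert, openslide or libdicom.

```python
def _clamp(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Full-range JFIF (BT.601) RGB -> YCbCr for 8-bit samples."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (_clamp(y), _clamp(cb), _clamp(cr))


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, up to rounding error."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return (_clamp(r), _clamp(g), _clamp(b))


# A white slide background, as the encoder stores it in a YCbCr tile.
stored = rgb_to_ycbcr(255, 255, 255)

# A reader that trusts PhotometricInterpretation = RGB skips the inverse
# transform and displays the stored triple directly as (R, G, B).
shown_if_tag_says_rgb = stored

# A reader told the tile is YCbCr applies the inverse transform first.
shown_if_tag_says_ycbcr = ycbcr_to_rgb(*stored)
```

With these formulas the white background is stored as a triple with both chroma samples at mid-range, so a reader that takes it for RGB displays a tinted background instead of white.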

tldr: when saving DICOM, if tiles are being recompressed, bfconvert needs to update the DICOM photometric interpretation tag.

Referring openslide issue: https://github.com/openslide/openslide/pull/558

Referring libdicom issue: https://github.com/ImagingDataCommons/libdicom/issues/80

jcupitt, Mar 08 '24 15:03

Also, even with -precompressed, bfconvert re-encodes the label and overview images to YCbCr but leaves their PhotometricInterpretation values as RGB.

bgilbert, Mar 09 '24 14:03

Thanks for reporting this, @jcupitt / @bgilbert. Just to make sure I understand before implementing a fix, would you expect the following to be sufficient?

  • only the PhotometricInterpretation tag (i.e. (0028,0004)) needs to be changed
  • if there are 3 samples per pixel, -precompressed was used, and no re-encoding was performed, then PhotometricInterpretation should be RGB
  • if there are 3 samples per pixel, -compression JPEG was used, and either -precompressed was used but re-encoding needed to happen anyway or -precompressed was not used, then PhotometricInterpretation should be YBR_FULL_422
  • if there are 3 samples per pixel, -compression was omitted or set to something other than JPEG, then PhotometricInterpretation should be RGB
  • if there is 1 sample per pixel, PhotometricInterpretation should remain MONOCHROME2
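The rules in this list can be written down as a small decision function. This is only a sketch of the proposal as stated above, not Bio-Formats code: the function name and parameters are hypothetical, with `compression` standing for the value passed to -compression (or `None` if omitted), `precompressed` for whether -precompressed was given, and `reencoded` for whether tiles had to be re-encoded anyway.

```python
from typing import Optional


def expected_photometric(samples_per_pixel: int,
                         compression: Optional[str],
                         precompressed: bool,
                         reencoded: bool) -> str:
    """Return the PhotometricInterpretation the proposed rules call for."""
    if samples_per_pixel == 1:
        # Single-channel data keeps its existing value.
        return "MONOCHROME2"
    if samples_per_pixel != 3:
        raise ValueError(f"unsupported samples per pixel: {samples_per_pixel}")

    # Tiles pass through the JPEG encoder when JPEG output is requested and
    # either -precompressed was not used or re-encoding happened regardless.
    tiles_reencoded_as_jpeg = (
        compression == "JPEG" and (not precompressed or reencoded)
    )
    if tiles_reencoded_as_jpeg:
        return "YBR_FULL_422"

    # Copied tiles, no -compression, or a non-JPEG codec.
    return "RGB"
```

For example, `expected_photometric(3, "JPEG", False, False)` follows the third bullet, and `expected_photometric(3, "JPEG", True, False)` follows the second.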

/cc @dclunie, @fedorov

melissalinkert, Mar 13 '24 23:03

Hi @melissalinkert, thanks for working on this.

I think you can assume the input file is correct, so if you just copy over the JPEG images, you don't need to do anything.

If you do a decompress or compress, you need to look at and perhaps set the DICOM PhotometricInterpretation tag.

  1. On decompress, you need to set the libjpeg input colorspace from the DICOM metadata during decompressor setup (don't rely on libjpeg to get this right, it'll guess wrong with e.g. SVS).
  2. Conversely, on compress, you need to set the DICOM tag to the colourspace you compressed the JPEGs to.
  3. And as Benjamin says, this also applies to the thumbnail / macro / label etc.
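As a rough sketch of the bookkeeping in these steps, the two directions can be expressed as a pair of lookups. The `JCS_*` strings are the names of libjpeg's `J_COLOR_SPACE` values, used here only as labels; the helper names are hypothetical, and the choice between `YBR_FULL` and `YBR_FULL_422` based on a single subsampling flag is a simplifying assumption.

```python
# DICOM PhotometricInterpretation -> colorspace to force on the JPEG decoder.
DECODE_COLORSPACE = {
    "MONOCHROME2": "JCS_GRAYSCALE",
    "RGB": "JCS_RGB",
    "YBR_FULL": "JCS_YCbCr",
    "YBR_FULL_422": "JCS_YCbCr",
}


def colorspace_for_decode(photometric: str) -> str:
    """Step 1: pick the decoder input colorspace from the DICOM header."""
    try:
        return DECODE_COLORSPACE[photometric]
    except KeyError:
        raise ValueError(f"unhandled PhotometricInterpretation: {photometric}")


def photometric_for_encode(colorspace: str, chroma_subsampled: bool) -> str:
    """Step 2: pick the DICOM tag value from what the encoder actually wrote."""
    if colorspace == "JCS_GRAYSCALE":
        return "MONOCHROME2"
    if colorspace == "JCS_RGB":
        return "RGB"
    if colorspace == "JCS_YCbCr":
        return "YBR_FULL_422" if chroma_subsampled else "YBR_FULL"
    raise ValueError(f"unhandled JPEG colorspace: {colorspace}")
```

The same pair of lookups would apply to every image in the file, including the thumbnail, macro and label images from step 3.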

jcupitt, Mar 14 '24 01:03