Viewers icon indicating copy to clipboard operation
Viewers copied to clipboard

[Bug] Regression handling RGB in XC modality

Open fedorov opened this issue 1 year ago • 4 comments

Describe the Bug

This sample is no longer rendered correctly: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.5962.1.2.0.1677425356.1733

image

For comparison, see how it used to work here: https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784:

image

To download the files in that study (note - it is ~233GB!), assuming you have python and pip on your system, do the following:

pip install --upgrade idc-index
idc download 1.3.6.1.4.1.5962.1.2.0.1677425356.1733

Steps to Reproduce

Load the sample study referenced in the above into the viewer.

The current behavior

First series shows grayscale with multiple slices arranged in a mosaic. Second does not load at all.

The expected behavior

Both should show up in color as used to work in the past given the screenshot in https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784

OS

macOS

Node version

n/a

Browser

Chrome

fedorov avatar Sep 18 '24 18:09 fedorov

For instance, if I want to debug this specific study without downloading it, how can I do that in my local OHIF since you said the DICOMweb server is public?

CC @igoroctaviano

sedghi avatar Sep 18 '24 18:09 sedghi

@sedghi I am not an OHIF developer, so I cannot answer how to do it in OHIF configuration (@igoroctaviano can), but IDC public DICOMweb endpoint is this: https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb (documented in https://learn.canceridc.dev/portal/proxy-policy). Note that the "no downloads" wording in the URL is legacy, and we do not restrict that DICOMweb endpoint for downloads - as discussed in the link above.

fedorov avatar Sep 18 '24 18:09 fedorov

DICOM PlanarConfiguration will indicate whether color image is ordered as RGBRGB or RRGGBB. Need to investigate.

fedorov avatar Sep 26 '24 18:09 fedorov

@dclunie I checked, and for Visible Human, we have both options for PlanarConfiguration. Neither one works.

SELECT distinct(PatientID), StudyInstanceUID, SeriesInstanceUID, PlanarConfiguration 
FROM `bigquery-public-data.idc_current.dicom_all` 
where Modality = "XC"
PatientID	StudyInstanceUID	SeriesInstanceUID	PlanarConfiguration
VHP-M	1.3.6.1.4.1.5962.1.2.0.1672334394.26545	1.3.6.1.4.1.5962.1.3.0.2.1672334394.26545	0
VHP-F	1.3.6.1.4.1.5962.1.2.0.1677425356.1733	1.3.6.1.4.1.5962.1.3.0.1.1677425356.1733	1
VHP-M	1.3.6.1.4.1.5962.1.2.0.1672334394.26545	1.3.6.1.4.1.5962.1.3.0.1.1672334394.26545	1
VHP-F	1.3.6.1.4.1.5962.1.2.0.1677425356.1733	1.3.6.1.4.1.5962.1.3.0.2.1677425356.1733	0

At the same time, I checked the following 2 US series, with different options

1 | LiverUS-04 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809 | 0 |  
2 | C3N-02436 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567 |  

and both work fine:

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567

So it must be something else, or combination with something else...

fedorov avatar Oct 22 '24 20:10 fedorov

Must be something else then. I get the impression that color image handling in OHIF is extremely fragile (whether compressed or uncompressed).

dclunie avatar Oct 23 '24 00:10 dclunie

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

pedrokohler avatar Oct 23 '24 10:10 pedrokohler

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

@pedrokohler, this is typically related to a decoding issue. I'd suggest rolling back to commits before any recent RGB/decoding changes, testing them one by one.

@sedghi, it might be a good idea to open an issue to add snapshot testing in CS3D to help prevent these kinds of regressions.

Here’s a similar issue we encountered previously: https://github.com/cornerstonejs/cornerstone3D/pull/829

igoroctaviano avatar Oct 23 '24 10:10 igoroctaviano

There's a widespread problem with DICOM color images. Here's the core issue:

  • When dicom servers sends DICOM color images, they can convert the pixel data from one color space (like YBR422) to another (like RGB) as we know, however, these servers frequently don't update the metadata to reflect this conversion. So you end up with a mismatch - the metadata might say "YBR422" even though the actual pixel data has been converted to RGB.
  • Apparently the DICOM standard suggests that the Photometric Interpretation field should reflect the format of the stored data (am I right @dclunie ?). But this raises a question - what should happen when the data gets converted?
  • Due to this widespread inconsistency, we can't reliably ONLY trust the Photometric Interpretation value in the metadata, we should look at the pixelData too. To work around this, we developed a solution that tries to detect the actual pixel data format by analyzing the values themselves (implemented here). While this approach improved our ultrasound image support and made things more consistent, we're still encountering some issues with it as you see in this issue

Take a look at these discussions for instance

https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533/3

I need help fixing the color settings permanently.

sedghi avatar Oct 23 '24 13:10 sedghi

@sedghi if you can provide a reproducible example demonstrating DICOM non-compliance of Google Healthcare API DICOMweb implementation, we can escalate through our collaborators at Google to try to get it fixed. I do not know enough about this issue to reproduce it myself.

fedorov avatar Oct 23 '24 13:10 fedorov

Can @dclunie clarify the DICOM standard's requirements for Photometric Interpretation? Should we rely on that? since we are seeing inconsistencies with that tag and what actually the server responds in pixelData

sedghi avatar Oct 23 '24 13:10 sedghi

@sedghi @fedorov after lots of testing, I found that this PR was the responsible for making it fail: https://github.com/cornerstonejs/cornerstone3D/pull/829/files

I'll investigate further to see what exactly is happening there

pedrokohler avatar Oct 23 '24 14:10 pedrokohler

@pedrokohler thank you, but let's wait to hear from @dclunie about the question asked by Alireza first!

fedorov avatar Oct 23 '24 14:10 fedorov

Very long ... sorry ...

Here are few points to consider wrt. what the standard says about this and how it relates to @sedghi's description of the recent changes, and what might be needed to make this more robust going forwards:

  • when an image is encoded in an uncompressed transfer syntax, all of the DICOM attributes in the Image Pixel Module need to accurately describe the binary contents of the uncompressed PixelData attribute, including PhotometricInterpretation - e.g., if the pixels are RGB, then that's what it needs to say, if they really are YBR (e.g., were encoded that way be the ultrasound machine) then again, that's what PhotometricInterpretation needs to say (usually YBR_FULL); see "Native" format in PS3.5 and Photometric Interpretation in PS3.3.
  • if an image is decompressed, whether from a lossless or lossy transfer syntax, and re-encoded in DICOM, obviously the decompressed file is required to have the correct description of the decompressed PixelData in the Image Pixel Module attributes - the classic example being when a JPEG image in YBR_FULL_422 is color component converted into RGB during decompression; what the standard requires in this respect is obvious; problems arise when the encoder of the decompressed image fails to set Photometric Interpretation correctly - viewers may try to apply heuristics to work around this (e.g., it's YBR_FULL_422 but decompressed so its probably wrong so I will assume RGB) but like any heuristics, they may fail, as we are apparently seeing here
  • if an image is compressed, whether losslessly or lossy, then the image pixel data characteristics (including the meaning of the color components +/- downsampling of the chrominance channels) may or may not be expressed in metadata within the compressed bit stream, but if it is, that is always to be used "to control the decoding"; that said, the DICOM Image Pixel Module attributes are still required to be "consistent with the characteristics of the compressed data stream" (even though in practice they may sometimes not be); for lossy baseline JPEG, the compressed bitstream might not describe the color components, so DICOM Photometric Interpretation of YBR_FULL_422 is really important to signal this, and RGB for compressed data is intended to indicated that the color components have NOT been transformed (as is the case with Leica/Aperio WSI, for example); the presence of a JFIF header is not required in the compressed bitstream, but if present, may signal YCbCr components rather than untransformed RGB components

From an application implementation perspective this can be even more challenging depending on what the underlying toolkits and libraries and codecs do and how much control there is over what they do (e.g., a lossy baseline JPEG decoder might always attempt YCbCr to RGB conversion even if they really were RGB, or always do it if a JFIF header is present, or check for the Adobe marker segment that signals RGB, or whatever).

To restate the obvious, the only way to mitigate against unexpected code changes causing failures is to test - have a good range of test inputs representing as many known good (and bad) variants as possible, and a decoded result to expect, and automate the comparison in regression tests. For properly encoded DICOM images (uncompressed, compressed and decompressed/re-encoded) there should be no question about the correct result. For "bad" images (e.g., RGB Photometric Interpretation for a YCbCr encoded lossy JPEG bitstream signaled by a JFIF header, or a decompressed JPEG that still has YBR_FULL_422 for the Photometric Interpretation) some heuristics may be implemented to work around this and need to be tested too but the heuristic should never prevent the correct handling of a correct image.

In the case of the Cornerstone PR #829 that was mentioned as responsible for the problem, there is a new function isColorConversionRequired() that is supposed to work around an [Orthanc server bug "Orthanc convert YBR to RGB but does not change metadata"](https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533/17), and which @sedghi also referred to. Unfortunately it seems that insufficient testing was done to assure that this workaround did not undermine handling of "correct" images in which the codec had handled the color component transformation (and chrominance channel upsampling; repeating it would screw things up, if that is what is actually happening.

The heuristic in isColorConversionRequired() seems to be trying to detect uncompressed chrominance downsampling based on the downsampling description in Photometric Interpretation matching the decompressed pixel data length, which is a whole other thing (and as @sedghi described, is often related to funky ultrasound images; some uncompressed or decompressed RLE ultrasound images really do have uncompressed color component downsampling). I don't actually see how the function (and its consequences if it returns true) actually specifically target the Orthanc defect anyway, other than by assuming that if it says YBR_FULL_422 but the decompressed pixel data length is not downsampled (i.e., it just equals columns * rows * samplesperpixel * bytespersample; I think the imageFrame must already exclude the padding byte to even judging by other code therein).

However, since our problem images should be Photometric Interpretation RGB and uncompressed, I am not sure how this PR is causing our images to fail, since isColorConversionRequired() should return false by my reading of it.

I assume there is no issue with the DICOMweb side of things, in that the image requested was returned uncompressed. If it had been requested with an Accept header of image/jpeg rather than application/octet-stream, and the server has lossy JPEG compressed them to satisfy the frame request (perhaps even retrieve rendered), and of course while doing that would very likely color space transform them, with no explicit means of signaling that (except perhaps a JFIF header included in the compressed bitstream). See also the warning about this sort of thing in the description in PS3.18. I.e., the metadata response and the bulk pixel data response would be inconsistent in such cases wrt. the characteristics of the pixel data. This is purely speculative on my part since I have now idea how OHIF is retrieving these images, or under what circumstances the routines in the PR get invoked.

I also see a bunch of stuff in the PR about Palette Color Lookup Tables, but that should not be relevant to us since they won't be in the problem image.

dclunie avatar Oct 23 '24 15:10 dclunie

@pedrokohler you should also use samples mentioned in https://github.com/OHIF/Viewers/issues/2934! Also samples from DICOM FTP server: ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG12/.

fedorov avatar Oct 24 '24 17:10 fedorov

@fedorov @dclunie the problem is the isColorConversionRequired function of the PR I sent above.

Once I force it to return true for the dataset above, it renders the image correctly.

@dclunie what else could be checked inside this function so that we would be able to tell that this image requires conversion? For instance, I could check to see if the modality is XC and always return true in this case - note that I don't know if this makes any sense, I'm just giving an example.

pedrokohler avatar Oct 25 '24 16:10 pedrokohler

Once I force it to return true for the dataset above, it renders the image correctly.

Very interesting observation.

It makes little sense though, since this really is an RGB image, so by definition should not require "color conversion".

I can only assume that whatever action is being (needs to be) triggered by isColorConversionRequired() returning true is doing more than "color conversion" - perhaps related to changing PlanarConfiguration of 1 (color-by-plane) to 0 (color-by-pixel)?

I could check to see if the modality is XC and always return true in this case

Please do not do that - you need to understand the root cause of the problem with what is perfectly good input data and fix it. There is nothing special about the modality or any other attribute that is not directly related to the description of the encoding of the pixel data (i.e., Image Pixel Module attributes and Transfer Syntax).

I think you need to figure out what "color conversion" entails and whether or not it can safely be triggered by PlanarConfiguration == 1, or if it needs to be refactored (to separate "color component conversion" (e.g, from YBR to RGB, +/- chrominance channel upsampling) from "plane organization" (color-by-plane to color-by-pixel).

Who wrote the "color conversion" code? You could talk to them about their design for this.

dclunie avatar Oct 25 '24 19:10 dclunie

@dclunie Thank you David for the excellent DICOM color crash course. It will help us improve our color handling stability.

sedghi avatar Oct 28 '24 12:10 sedghi

@fedorov It looks like this was really a cornerstone3D issue but it was already fixed in this PR https://github.com/cornerstonejs/cornerstone3D/pull/1457/files

I'm waiting for Bill Longabaugh to confirm that ViewersV3 does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

If this is indeed the case, then we just need to deploy the master branch and the issue should be fixed.

pedrokohler avatar Oct 28 '24 13:10 pedrokohler

I'm waiting for Bill Longabaugh to confirm that idc-slim does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

You mean ViewersV3, not idc-slim, right?

fedorov avatar Oct 28 '24 13:10 fedorov

You mean ViewersV3, not idc-slim, right?

My bad. Yes, I mean ViewersV3.

pedrokohler avatar Oct 28 '24 13:10 pedrokohler

Replicating from Slack and to keep @dclunie in the loop. @pedrokohler can you please take a look?

Per @wlongabaugh:

OK, in the past, it appears the > 32 MB slices got sent as roughly ~20 MB slices, and this appeared to be due to a change in the Accept: transfer-syntax=* request header: https://github.com/ImagingDataCommons/IDC-ProjectManagement/issues/1574#issuecomment-1924983645

1:13 OK, I admit I was wrong, and yes it used to work, by getting the > 32 MB slices to work by getting them to be ~20 MB on the wire.

I note the current V3 viewer is sending this:

accept:
multipart/related; type=application/octet-stream; transfer-syntax=*; transfer-syntax=*

1:12 Is the second transfer-syntax=* (and w/o a ";") messing things up?

fedorov avatar Nov 05 '24 19:11 fedorov

Checkout extensions/cornerstone/src/initWADOImageLoader.js that is the location that decide on the transfer syntax

and

platform/core/src/utils/generateAcceptHeader.ts

sedghi avatar Nov 05 '24 19:11 sedghi

@fedorov I removed the duplicate and just tested it with transfer-syntax=* and with transfer-syntax=*;, meaning both with and without semicolon. Still same error occurs in both cases.

image

image

pedrokohler avatar Nov 06 '24 12:11 pedrokohler

@fedorov either way this error has nothing to do with this ticket. According to @wlongabaugh on Slack this same error is occurring in production right now.

pedrokohler avatar Nov 06 '24 12:11 pedrokohler

Thank you for checking. Let's wait to see what @wlongabaugh says about this.

fedorov avatar Nov 06 '24 15:11 fedorov

Sometimes the type should be in quotes. We used to have an option to omit it, as including it can trigger an options request flow on some servers, slowing things down. However, sometimes it requires the type to be in quotes, like multipart/related; type="application/octet-stream"; transfer-syntax=*.

sedghi avatar Nov 06 '24 15:11 sedghi

@wlongabaugh @pedrokohler note that while accessing IDC DICOM store directly - bypassing the proxy - I am able to visualize both series.

See demo here: https://app.screencast.com/QEePMcwuQOuWb.

Also note the type is indeed in quotes:

image

No server error

image

fedorov avatar Nov 06 '24 17:11 fedorov

For this experiment, I utilized the v3 instance deployed by @igoroctaviano via https://github.com/ImagingDataCommons/idc-viewers-sandbox-gha-testing. I do not know where that version is relative to various IDC tiers.

fedorov avatar Nov 06 '24 17:11 fedorov

Could it be that the proxy started dropping the quotes @wlongabaugh?

fedorov avatar Nov 06 '24 17:11 fedorov

What size are the slices that are being sent directly from Google? Are they 32+ MB, or ~20 MB? If Google has taken to ignoring the fact that we are allowing a gzip transfer encoding to be sent, that would be the problem. @fedorov

To clarify, I have no doubt that a direct call to Google would work. The question is whether the direct call returns a compressed slice or not. If it is uncompressed, that yes the Viewer will read it, but it would not survive a pass through the proxy since it is no longer compressed.

wlongabaugh avatar Nov 06 '24 17:11 wlongabaugh