dcmjs icon indicating copy to clipboard operation
dcmjs copied to clipboard

fix: 🐛 IDC1346, properly detect binary segmentations declared as fractional

Open Punzo opened this issue 2 years ago • 17 comments

this fix https://github.com/OHIF/Viewers/issues/1346.

https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.14519.5.2.1.7695.4164.330974290504454152904316943429 has two segmentations: 1) Screenshot from 2022-09-14 13-59-19

this one was actually a binary seg (but the DICOM multiframe.SegmentationType is set to fractional). The seg array values are [0,1] (note: the multiframe.MaximumFractionalValue is 255).

There was already a check for this case (binary seg, but defiined as fractional), but the check was checking if the array values were either [0,multiframe.MaximumFractionalValue]. Now I changed in a way that it will check if the values are [0, firstNonZeroValue]. In this case the seg will be loaded as a normal binary.

Screenshot from 2022-09-14 13-59-37

here the seg has various values in the array (something like: [0, 12, 33, 48]). For this case, I added a flag (default to true) for converting the seg into a binary one. The conversion is done with a simple thresholding where I use the maximum value found in the array (multiframe.MaximumFractionalValue does not look be reliable number, at least for IDC datasets) and I multiply it for a thesholdingParameter (80% in default).

@pieper can you please review? thanks!

Punzo avatar Sep 14 '22 12:09 Punzo

@pieper some tests are failing, but they not related to this PR, and they fail on master branch too. It looks like first detect was at commit https://github.com/dcmjs-org/dcmjs/commit/42a7ca7d007ef12a0f287d9b23771c1516fab6ab

Punzo avatar Sep 14 '22 12:09 Punzo

@Punzo thanks for digging into this 👍

Do we know what the segmentations are supposed to look like for this dataset? From what I see the first image might be correct but the second looks odd. My guess is that the submitters to TCIA had labelmaps and decided it was easier to put them in a single segment and call it fractional rather than going through the process to make bitplanes.

The images are described here but it's hard to tell how that would map to what we see here. They say "Derived objects from the DWI and DCE acquisitions are included for the convenience of the user. Some of these are not strictly DICOM compliant and may not be readable by all DICOM software packages." So I'm not sure what we should expect.

Regarding the tests, yes, there's a PR to fix that: https://github.com/dcmjs-org/dcmjs/pull/308 following conversation here: https://github.com/dcmjs-org/dcmjs/pull/303

pieper avatar Sep 14 '22 14:09 pieper

Do we know what the segmentations are supposed to look like for this dataset?

I am afraid we do not!

Very unfortunately, David Newitt at UCSF, who did this conversion using in-house matlab code (I believe), retired, and there isn't anyone to ask for clarifications about those segmentations.

These are the collections where FRACTIONAL is encountered: ispy2, qiba_ct_1c, breast_mri_nact_pilot, acrin_6698, ispy1. All except the QIBA one come from David Newitt / UCSF, from what I know :-(

I can generate sample URLs for image+segmentation pairs from each of the collections mentioned, if this helps. But this becomes complete guesswork... Maybe indeed we should shelve support of those collections until we have FRACTIONAL support implemented.

fedorov avatar Sep 14 '22 15:09 fedorov

@Punzo thanks for digging into this +1

Do we know what the segmentations are supposed to look like for this dataset? From what I see the first image might be correct but the second looks odd. My guess is that the submitters to TCIA had labelmaps and decided it was easier to put them in a single segment and call it fractional rather than going through the process to make bitplanes.

The images are described here but it's hard to tell how that would map to what we see here. They say "Derived objects from the DWI and DCE acquisitions are included for the convenience of the user. Some of these are not strictly DICOM compliant and may not be readable by all DICOM software packages." So I'm not sure what we should expect.

not sure, but the second segmentation is defined as a mask, so maybe is the mask used in the seg and why is so ackward. Please let me know if I should take any other action.

Regarding the tests, yes, there's a PR to fix that: #308 following conversation here: #303

Nice!

Punzo avatar Sep 14 '22 15:09 Punzo

Maybe indeed we should shelve support of those collections until we have FRACTIONAL support implemented.

Up to you, but even with proper support, the shape of the segmentation will not change respect the one in my second screenshot (only the edges will be "smoothed" with proper support). In the conversion to binary, I just have essentially taken the pixels with highest value (48) and put their value to 1 (and 0 to all others pixels). So the shape of seg would not change. See https://github.com/dcmjs-org/dcmjs/pull/309/files#diff-077798721d0df020ee8184a2f54d4459dba916e19c86ac0f0aecc4e6ea80bf2fR1274-R1303

Punzo avatar Sep 14 '22 15:09 Punzo

Maybe indeed we should shelve support of those collections

Given this discussion that's what I would vote for.

pieper avatar Sep 14 '22 16:09 pieper

Maybe indeed we should shelve support of those collections.

Given this discussion that's what I would vote for.

Ok then for the second segmentation, i will just remove the code that does the thresholding and I will reput the warning that fractional segs are not supported. The first seg will be loaded as a binary.

Punzo avatar Sep 14 '22 21:09 Punzo

Maybe indeed we should shelve support of those collections.

Given this discussion that's what I would vote for.

Ok then for the second segmentation, i will just remove the code that does the thresholding and I will reput the warning that fractional segs are not supported. The first seg will be loaded as a binary.

Done in https://github.com/dcmjs-org/dcmjs/pull/309/commits/4736ff15c326310b73bb5741df58ead959955561 @pieper @fedorov. Please let me know if I need to address anything else. As soon as this will be merged, I will update OHIF.

Punzo avatar Sep 14 '22 21:09 Punzo

Maybe indeed we should shelve support of those collections

Given this discussion that's what I would vote for.

P.S.: if you meant that we should just not support the full collection, I will leave to you to close this PR.

Punzo avatar Sep 15 '22 09:09 Punzo

Good question @Punzo . @fedorov do you see any value in supporting the special case of a fractional seg that only has one non-zero value? I tend to think it is better to be consistent and avoid displaying something that is not cleanly encoded.

pieper avatar Sep 15 '22 13:09 pieper

@fedorov should we close this?

Punzo avatar Sep 19 '22 10:09 Punzo

I feel bad about discarding all the work you have done, but I agree with Steve - implementing custom rules based on assumptions can be dangerous. Let me also discuss this with @dclunie before we finalize the decision and close this PR.

fedorov avatar Sep 19 '22 13:09 fedorov

For the sake of completeness, this dataset is from the collection documented here: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=50135447#501354470d1c0b53772a4fba8580e8b19f65069b

Specifically, there is Analysis mask files description. However, conventions described in this document do not seem to be followed in the sample investigated here.

image

fedorov avatar Sep 29 '22 13:09 fedorov

I looked at one of these, and even though it does not appear to follow the exact description mentioned, it does look suspiciously like a bit mask with various combinations bits from 0x00 through 0x20 set:

% dchist -h ACRIN-6698-102212/04-04-2002-102212T0-ACRIN-6698ISPY2MRIT0-82630/61900.000000-ISPY2\ VOLSER\ uni-lateral\ cropped\ Analysis\ Mask-08921/1-1.dcm

[0] 42423 p=0.00809155 e=0.0562311 cum=0.0562311 [0x1] 66288 p=0.0126434 e=0.0797228 cum=0.135954 [0x2] 453 p=8.64029e-05 e=0.00116631 cum=0.13712 [0x11] 117012 p=0.0223183 e=0.12243 cum=0.25955 [0x20] 21569 p=0.00411396 e=0.0326042 cum=0.292154 [0x21] 345053 p=0.0658136 e=0.258349 cum=0.550504 [0x22] 3493 p=0.000666237 e=0.00702992 cum=0.557534 [0x31] 4646589 p=0.886267 e=0.154377 cum=0.71191

This is obviously not a valid use of a DICOM SEG object if that is indeed the case (nor is what is described in the document).

But potentially something that could be split into separate binary segmentations if someone wanted to bother.

I don't think we (IDC) should expect any viewer to support such an invalid object though, regardless.

dclunie avatar Sep 29 '22 19:09 dclunie

Found a private data element within the SEG object that describes the mask:

(0x0117,0x10d3) LO Mask Legend VR=<LO> VL=<0x0088> <00000001 ( 1) : PE threshold\00000010 ( 2) : other SER filter\00010000 (16) : automatic background threshold\00100000 (32) : manual VOI >

dclunie avatar Sep 29 '22 19:09 dclunie

Thank you for investigating @dclunie

I don't think we (IDC) should expect any viewer to support such an invalid object though, regardless.

Agreed, let's close this as an issue but perhaps add a link to this discussion someplace. Do we have something like release notes or a "known issues" document to collect links to discussions like this?

pieper avatar Sep 29 '22 20:09 pieper

Perhaps we should put in a TCIA HelpDesk request to correct the invalid SEG objects? In that, we could document the problem and suggest a solution.

dclunie avatar Sep 29 '22 20:09 dclunie