fix: handle XA modality multi-frame grayscale dicom redaction
Change Description
Fixed handling of XA modality (X-Ray Angiography) DICOM images that were previously failing with
ValueError: Too many dimensions: 3 > 2 when passed to Image.fromarray.
- Added
_dicom_np_to_pilhelper to robustly handle conversion of multi-frame grayscale and RGB images. - Updated
_get_most_common_pixel_valueto correctly handle multi-frame grayscale by taking the first frame. - Updated
_set_bbox_colorto use_dicom_np_to_pilfor consistency across grayscale and RGB images. - Ensured redaction works as expected for XA modality and similar multi-frame images.
- Verified fix with local tests (
test_dicom_redactor.py) β all tests passed.
Issue reference
Fixes #1731
Checklist
- [x] I have reviewed the contribution guidelines
- [x] I have signed the CLA (if required)
- [x] My code includes unit tests (local test added to validate XA modality fix)
- [x] All unit tests and lint checks pass locally
- [x] My PR contains documentation updates / additions if required
Hi @omri374 @joelbyler π
Iβve submitted a fix for issue #1731 (handling XA modality multi-frame grayscale DICOMs).
The changes add a _dicom_np_to_pil helper and improve handling in _get_most_common_pixel_value and _set_bbox_color to avoid the ValueError: Too many dimensions error.
Could you please take a look when you get a chance? Thanks a lot!
Hi @MRADULTRIPATHI thanks for the PR! Please add unit tests.
Hi @omri374
Thanks for the feedback earlier. I've added the requested unit test for multi-frame grayscale DICOM handling.
All tests are passing locally
Please let me know if youβd like any further changes.
I'll test this out today, one thing that does come to mind, and its probably outside the scope of this change, but presidio dicom processing should really support multiframe image redaction. Its definately possible that if there is one frame with potential PHI included, there will likely be multiple frames with that same PHI included. perhaps it will always be in the same place in all or most frames, but possibly not. I'm thinking this is a separate issue though.
Thanks @joelbyler
Agreed β multi-frame redaction across all frames would definitely be valuable, and I agree it should be tracked as a separate enhancement issue.
When you test it out, could you please share the results here as well? That would be really helpful.
I have run my tests locally using some test files and this does resolve the exception that I was seeing, and I can confirm that presidio is stlil not processing multliple frames for redaction, but that's not a regression at this point. I'll log a separate issue to provide image redaction support for multiframe dicom instances.
The files I have include synthetic PHI and computer generated pixeldata, glad to share those if its of interest.
Actually I'll share them here for reference, but also add them to the new issue. multiframe_test_images.zip
Hi @joelbyler
Thanks a lot for reviewing and confirming that the exception is resolved π
Since this PR addresses the reported issue (#1731) and the separate multi-frame enhancement is now being tracked in #1737, I just wanted to check β is this PR ready to be merged, or would you like me to make any further changes? π
Hi @joelbyler
Thanks a lot for reviewing and confirming that the exception is resolved π
Since this PR addresses the reported issue (#1731) and the separate multi-frame enhancement is now being tracked in #1737, I just wanted to check β is this PR ready to be merged, or would you like me to make any further changes? π
I'm not a maintainer on this project but I think these changes work for my use case
Thanks a lot @joelbyler for confirming that the fix works for your use case π
Hi @omri374 π
Since this PR resolves the reported issue (#1731) and tests are passing locally, could you please review/approve it for merge when you get a chance? π
Hi @omri374 π
Just following up on this PR β since it resolves issue #1731 and tests are passing locally, could you please take a look and let me know if itβs good to merge? π
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Hi @omri374 π
I noticed that several CI jobs (linting + test_image_redactor) are failing after my changes.
Could you please guide me on what adjustments are needed to get all tests passing? π
Happy to update the PR accordingly.
@joelbyler can you help me with that??
Hi @microsoft/presidio-administrators π
This PR fixes the XA modality multi-frame grayscale DICOM redaction issue (#1731).
All linting/formatting issues have been resolved with ruff, and the tests are passing locally.
Since this PR now mainly waits for code owner review, could you please take a look and approve when you get a chance? π
Thanks a lot!
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Hi @omri374 π I noticed that several CI jobs (linting + test_image_redactor) are failing after my changes. Could you please guide me on what adjustments are needed to get all tests passing? π Happy to update the PR accordingly.
This PR now has 93 files. It seems to incorporate the anonymizer into the analyzer. Is there a reason for such big PR?
Hi @MRADULTRIPATHI, are you interesting in continuing to work on this PR? is there anything we can assist with?