presidio
Improve performance when DICOM image does not have metadata
Is your feature request related to a problem? Please describe.
When working with DICOM images, the metadata has sometimes already been scrubbed. In these cases, the DicomImageRedactorEngine cannot use metadata to help identify sensitive text (PHI) burnt into the image, which leads to sub-optimal recall.
Describe the solution you'd like
Update the DicomImageRedactorEngine to be more robust when there is little or no metadata present in the DICOM file. This may include checking whether appropriate metadata is present (e.g., patient info) and, if not, using a specific pre-built analyzer.
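A minimal sketch of the metadata check described above, to make the idea concrete. It assumes the DICOM header has already been read into a plain mapping of keyword to value; `has_usable_metadata`, `choose_strategy`, and the strategy labels are hypothetical names for illustration, not part of the existing engine. `PatientName`, `PatientID`, and `PatientBirthDate` are standard DICOM keywords, used here as an illustrative subset.

```python
from typing import Iterable, Mapping

# Illustrative subset of DICOM keywords that usually carry patient identifiers.
PATIENT_KEYWORDS = ("PatientName", "PatientID", "PatientBirthDate")


def has_usable_metadata(
    metadata: Mapping[str, object],
    keywords: Iterable[str] = PATIENT_KEYWORDS,
) -> bool:
    """Return True if at least one patient-related element has a non-empty value."""
    for keyword in keywords:
        value = metadata.get(keyword)
        # Missing elements and empty or whitespace-only values count as scrubbed.
        if value is not None and str(value).strip():
            return True
    return False


def choose_strategy(metadata: Mapping[str, object]) -> str:
    """Pick a redaction strategy label (hypothetical) based on available metadata."""
    if has_usable_metadata(metadata):
        return "metadata_assisted"
    return "metadata_free"
```

For example, `choose_strategy({"PatientName": "", "PatientID": None})` falls back to `"metadata_free"`. Placeholder values such as "ANONYMOUS" would still count as present in this sketch and would need extra handling.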
Describe alternatives you've considered
Alternatives (which could be incorporated) include:
- Using an allow-list instead of a deny-list approach, where particular non-sensitive medical terminology (e.g., positional terms, scanner metadata) is allowed and everything else is automatically flagged as sensitive
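
The allow-list alternative could look roughly like the sketch below. It assumes the OCR step yields a list of words; `ALLOWED_TERMS` and `flag_sensitive` are hypothetical names, and the term list is a small made-up sample rather than a vetted vocabulary.

```python
from typing import Iterable, List

# Hypothetical sample of non-sensitive positional terms; a real list would be curated.
ALLOWED_TERMS = frozenset(
    {"LEFT", "RIGHT", "L", "R", "AP", "PA", "LATERAL", "SUPINE", "PORTABLE"}
)


def flag_sensitive(
    ocr_words: Iterable[str],
    allowed: frozenset = ALLOWED_TERMS,
) -> List[str]:
    """Return every OCR word that is not on the allow-list."""
    flagged = []
    for word in ocr_words:
        # Normalize case and strip surrounding punctuation before the lookup.
        normalized = word.strip(" .,:;()[]").upper()
        if normalized and normalized not in allowed:
            flagged.append(word)
    return flagged
```

With this sketch, `flag_sensitive(["LEFT", "DOE", "John", "ap", "(PORTABLE)"])` returns `["DOE", "John"]`. The trade-off is higher recall at the cost of redacting any legitimate term that is missing from the list.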