dicom-anon
Support Pixel Anonymizers
Allow plugging in a pixel anonymizer that blacks out the burned-in annotations. Ideally it would plug in here and look something like: https://github.com/johnperry/CTP/blob/master/source/files/scripts/DicomPixelAnonymizer.script
Hi, thanks for the suggestion. I don't have much experience with CTP, but I agree an option to plug in a preferred pixel anonymizer would be a nice feature. I think the option could go here, right before it cleans out the headers, so that we don't destroy data the pixel cleaner needs. Do you have any experience with scripts to do this?
Are you currently using the dicom-anon script?
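As a rough sketch of the ordering described above, a pluggable pixel cleaner could be any callable that runs before the header cleaning, so it can still read the headers it needs. Everything here is hypothetical: `anonymize`, `clean_headers`, the `pixel_cleaner` parameter, the choice of headers, and the plain dict standing in for a dataset are for illustration only and are not dicom-anon's actual API.

```python
# Hypothetical sketch: a plain dict stands in for a DICOM dataset.

def clean_headers(dataset, keys_to_remove=("PatientName", "Manufacturer")):
    """Remove header entries in place (stand-in for the real header cleaning)."""
    for key in keys_to_remove:
        dataset.pop(key, None)


def anonymize(dataset, pixel_cleaner=None):
    """Run the optional pixel cleaner first, then strip the headers."""
    if pixel_cleaner is not None:
        # The cleaner still sees every header at this point.
        pixel_cleaner(dataset)
    clean_headers(dataset)
    return dataset


def example_cleaner(dataset):
    """Toy rule: a made-up vendor burns text into the first two pixels."""
    if dataset.get("Manufacturer") == "ExampleVendor":
        dataset["PixelData"][:2] = b"\x00\x00"


ds = {
    "PatientName": "DOE^JANE",
    "Manufacturer": "ExampleVendor",
    "PixelData": bytearray(b"\xff\xff\xff\xff"),
}
anonymize(ds, pixel_cleaner=example_cleaner)
```

If the header cleaning ran first in this toy example, the cleaner would no longer find the header it keys on and would leave the pixels untouched.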
I am looking to use it. Currently I have a MATLAB script that does the anonymization, but I would prefer to move to Python. In my MATLAB script I blank out the burned-in annotations.
Another Python implementation I found is: https://github.com/darcymason/pydicom/blob/dev/pydicom/examples/anonymize.py
You also want to remove the burned-in annotation and then set the Burned In Annotation attribute to NO so that the file does not get quarantined.
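A minimal sketch of that two-step idea, assuming uncompressed, single-frame pixel data with a known row and column layout. The `blank_regions` helper, the `(top, left, height, width)` region tuples, and the dict standing in for the dataset are all hypothetical; real files would need a DICOM library and handling for compressed transfer syntaxes. In DICOM, Burned In Annotation (0028,0301) takes the values YES or NO.

```python
def blank_regions(pixels, rows, cols, regions, bytes_per_pixel=1):
    """Zero out rectangular regions of raw, uncompressed pixel bytes in place.

    pixels  -- bytearray of length rows * cols * bytes_per_pixel, row-major
    regions -- iterable of (top, left, height, width) tuples in pixel units
    """
    if len(pixels) != rows * cols * bytes_per_pixel:
        raise ValueError("pixel buffer does not match the stated dimensions")
    for top, left, height, width in regions:
        # Clip each rectangle to the image bounds.
        bottom = min(top + height, rows)
        right = min(left + width, cols)
        top, left = max(top, 0), max(left, 0)
        if right <= left:
            continue
        for row in range(top, bottom):
            start = (row * cols + left) * bytes_per_pixel
            end = (row * cols + right) * bytes_per_pixel
            pixels[start:end] = bytes(end - start)
    return pixels


# Toy 4x4 single-byte image with every pixel set to 255.
dataset = {"BurnedInAnnotation": "YES", "PixelData": bytearray([255] * 16)}
blank_regions(dataset["PixelData"], rows=4, cols=4, regions=[(0, 0, 1, 4)])
# Only flip the flag after the annotation has actually been blanked.
dataset["BurnedInAnnotation"] = "NO"
```

Deciding which regions to blank is the hard part; this sketch only covers applying them once they are known.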
I have used that. I basically wrote this script to be a more extensive version of that one.
On Tue, Mar 3, 2015 at 1:12 PM, Alex Rothberg [email protected] wrote:
I am looking to use it. Currently I have a MATLAB script that does the anonymization http://www.mathworks.com/help/images/ref/dicomanon.html, but I would prefer to move to Python. In my MATLAB script I blank out the burned-in annotations.
Good point! I probably won't have time to properly dig into writing a pixel anonymizer in the near term, but if you have something in MATLAB that you would like to convert to Python and contribute to the project, we welcome pull requests. I think the hard part is all the heuristics for identifying likely burned-in data (and making that extensible), which you might already have (and it looks like the CTP script has a good start as well).
It has always been on my wish list to try to use some simple machine learning or OCR to look for text, or at least alert above a certain confidence.
I'd certainly be interested in helping integrate something if you contributed.
It looks like OB and OW VRs are being removed here: https://github.com/chop-dbhi/dicom-anon/blob/4a6f06887459e72fb07ba17c28ad2fa4747c74e0/dicom_anon.py#L551 which are the VRs set on pixel data. This means the entire pixel data seems to be removed when "anonymizing".
So you gave me a heart attack on this one, but have you tried it and seen it delete the pixel data? I think that because of this line in pydicom, https://github.com/darcymason/pydicom/blob/master/source/dicom/_dicom_dict.py#L3706, it actually sets that VR string to "OB or OW" and it fails to match. Assuming this is what is preventing the problem for you, it is definitely not something the script should rely on.
I'm not sure I follow what you are saying.
It looks like the VR string as presented by pydicom may be: 'OB or OW', 'OB' or 'OW'.
I have dealt with the issue for now:
    def vr_handler(ds, e):
        if (e.VR in ['PN', 'CS', 'UI', 'DA', 'DT', 'LT', 'UN', 'UT', 'ST', 'AE', 'LO', 'TM', 'SH', 'AS', 'OB', 'OW'] and
                e.tag != PIXEL_DATA):
            del ds[e.tag]
            return True
        return False
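To see why an explicit tag check is safer than relying on the VR string, here is a small sketch with plain strings standing in for pydicom elements. The `vr_matches` helper and the shortened `REMOVABLE_VRS` set are hypothetical; the idea is to split a compound VR string on " or " so that all three spellings are treated the same way.

```python
REMOVABLE_VRS = {"PN", "LO", "SH", "OB", "OW"}  # shortened list, for illustration


def vr_matches(vr, candidates=REMOVABLE_VRS):
    """Return True if any alternative in a VR string is in candidates."""
    return any(part in candidates for part in vr.split(" or "))


for vr in ("OB", "OW", "OB or OW"):
    exact = vr in REMOVABLE_VRS
    print(f"{vr!r}: exact match={exact}, split match={vr_matches(vr)}")
```

An exact membership test misses the compound string, so whether pixel data survives would depend on which form the file happens to produce. A split-based check matches all three forms, which is why pixel data then needs to be excluded by tag.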
Have you seen a situation where pydicom actually sets e.VR for the pixel data element to the string "OW" or the string "OB"?
I ask because, from the file I linked to, it looks like pydicom sets that string to "OB or OW", so it won't match.
Here is an example from ipython examining a dicom file:
    a[0x7fe0, 0x0010].VR
    'OW or OB'
Definitely:
    In [388]: ds = dicom.read_file("/Users/alex/Downloads/series (1).dcm")
    ds[0x7fe0, 0x0010].VR
    Out[388]: 'OB'
and after running the file through dcmdjpeg I see OW.
Look at that. Thanks for catching that.