How to de-identify a `pydicom.Dataset`? - addition of example to docs
Hi,
I have implemented my own logic to load a pydicom.Dataset instance from a database. I would like to de-identify the instance without having to write it as a file and then read it with deid.
Is there anything similar to the following?

    def replace_identifiers(recipe, dataset: pydicom.dataset.Dataset) -> pydicom.dataset.Dataset:
        """de-identify a single pydicom.dataset.Dataset instance"""
        ...
Aside from adding typing to deid here, you should be able to do:
    if isinstance(dataset, pydicom.dataset.Dataset):
        replace_identifiers(...)
Also, typing in and of itself doesn't prevent you from providing the wrong type! E.g.:

    In [1]: def func(name: str):
       ...:     print(name)
       ...:

    In [2]: func(1)
    1
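Because annotations are not checked at runtime, a guard has to be written explicitly if the wrong type should be rejected. The following is only a minimal sketch of that idea; `checked_func` is a hypothetical name used for illustration and is not part of deid or pydicom.

```python
def checked_func(name: str) -> str:
    """Return the name unchanged, rejecting non-string input at runtime."""
    # The annotation alone is not enforced, so the type is checked explicitly.
    if not isinstance(name, str):
        raise TypeError(f"expected str, got {type(name).__name__}")
    return name


# Calling with a string works; calling with an int raises TypeError.
ok_result = checked_func("patient")
try:
    checked_func(1)
    rejected = False
except TypeError:
    rejected = True
```

The same pattern applies to the isinstance check on a dataset suggested above.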
After some more digging through the documentation, I solved my problem with the following:
    import logging
    from typing import Optional

    import pydicom
    from deid.config import DeidRecipe
    from deid.dicom.parser import DicomParser

    class DeidDataset:
        def __init__(self, recipe_path: Optional[str] = None):
            """Deidentify datasets according to vaib recipe
            :param recipe_path: path to the deid recipe
            """
            # default_recipe_path is defined elsewhere (not shown in this snippet)
            if recipe_path is None:
                logging.warning(f"DeidDataset using default recipe {default_recipe_path}")
                recipe_path = default_recipe_path
            self.recipe = DeidRecipe(recipe_path)

        def anonymize(self, dataset: pydicom.Dataset) -> pydicom.Dataset:
            """Anonymize a single dicom dataset
            :param dataset: dataset that will be anonymized
            :returns: anonymized dataset
            """
            parser = DicomParser(dataset, self.recipe)
            # Register the custom functions referenced by the recipe
            parser.define('remove_day', self.remove_day)
            parser.define('round_AS_to_nearest_5y', self.round_AS_to_nearest_5y)
            parser.define('round_DS_to_nearest_5', self.round_DS_to_nearest_5)
            parser.define('round_DS_to_nearest_0_05', self.round_DS_to_nearest_0_05)
            parser.parse(strip_sequences=True, remove_private=True)
            return parser.dicom

        ...
Thanks for making this tool available.
Oh, that's fantastic! Do you mind if I include it in our docs somewhere as an example? Even if we create a gist and then link to it, I think it might be super helpful for future users.
Of course! I will be OOO for the next two weeks. If you can wait that long, I will make a proper PR afterwards adding the example to the docs.
Actually, I just found that this only works for files. There are two lines that must be silenced in order for it to work with a dataset that doesn't come from a file:
https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/parser.py#L114
https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/parser.py#L115
and the file meta here:
https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/fields.py#L254
Yes, of course! When you are back, ping me if you have questions or want any help.
I'm back 😄
I will need to expose an argument to be able to silence these two lines.
> Actually, I just found that this only works for files, there are two lines that must be silenced in order for it to work with a dataset that doesn't come from a file:
> https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/parser.py#L114
> https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/parser.py#L115
I propose adding a boolean from_file argument to the __init__ method of DicomParser and then using if-else statements to silence the lines accordingly.
For the DicomField part that needs to be silenced:
> and the file meta here:
> https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/fields.py#L254
I can add the same argument to this method and skip the dicom.file_meta part accordingly:
https://github.com/pydicom/deid/blob/0807f20bfc36b1f30828ed562c7f79e14b5f6100/deid/dicom/fields.py#L240
To keep the interface intact, the default value for the proposed arguments would be True; the user would set it to False only when needed.
Does this sound good to you @vsoch ?
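To make the proposal concrete, here is a rough sketch of a from_file flag that defaults to True. It assumes the two linked lines read the file path from the dataset, which is my reading of the discussion rather than something verified here. `ParserSketch` and the stand-in dataset objects are hypothetical and are not deid's DicomParser.

```python
from types import SimpleNamespace
from typing import Any, Optional


class ParserSketch:
    """Hypothetical stand-in for a parser; not deid's DicomParser."""

    def __init__(self, dicom: Any, from_file: bool = True):
        self.dicom = dicom
        if from_file:
            # File-backed datasets are assumed to carry a filename attribute.
            self.dicom_file: Optional[str] = dicom.filename
        else:
            # In-memory datasets (e.g. rebuilt from JSON) have no path to read.
            self.dicom_file = None


# Stand-in objects instead of real pydicom datasets.
file_backed = SimpleNamespace(filename="/data/scan.dcm")
in_memory = SimpleNamespace()

default_parser = ParserSketch(file_backed)               # existing behaviour
memory_parser = ParserSketch(in_memory, from_file=False)  # opt-out for in-memory data
```

Keeping True as the default means existing callers that pass file-backed datasets would not need to change.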
A dataset that doesn’t come from a file - what would it be?
I am loading a dataset that was stored in a database as JSON. Therefore it has no file path and no file_meta.
Gotcha. OK, just make functions to derive both of those items then, and if you cannot, set them to None, and make sure the places that use them also respond appropriately.
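A minimal sketch of that suggestion might look like the following. The helper names `derive_filename`, `derive_file_meta`, and `describe` are hypothetical, and the stand-in objects only mimic the two attributes discussed in this thread; this is not deid's actual code.

```python
from types import SimpleNamespace
from typing import Any, Optional


def derive_filename(dicom: Any) -> Optional[str]:
    """Return the dataset's file path, or None if it has none."""
    return getattr(dicom, "filename", None)


def derive_file_meta(dicom: Any) -> Optional[Any]:
    """Return the dataset's file meta, or None if it has none."""
    return getattr(dicom, "file_meta", None)


def describe(dicom: Any) -> str:
    """Example consumer that responds appropriately to a missing path."""
    filename = derive_filename(dicom)
    return "<in-memory dataset>" if filename is None else filename


# Stand-ins: one dataset read from a file, one rebuilt from a database record.
from_disk = SimpleNamespace(filename="/data/scan.dcm", file_meta={"meta": "value"})
from_db = SimpleNamespace()

disk_label = describe(from_disk)
db_label = describe(from_db)
```

Compared with an explicit flag, this approach needs no new argument: callers pass any dataset, and code that uses the path or file meta checks for None.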