
Filter only ".dcm" files when reading DICOM series

Open function2-llx opened this issue 2 years ago • 4 comments

Is your feature request related to a problem? Please describe.
When I load a DICOM folder, loading fails if the folder contains any irrelevant file (unless I set force=True). For example, when I load a DICOM series from the LIDC-IDRI database, there is an .xml annotation file in the folder; the loader tries to load it as well, which causes the failure.

Describe the solution you'd like
Change this line https://github.com/Project-MONAI/MONAI/blob/2cbed6cfa7a007fa8853a7bd8cf09303172686c9/monai/data/image_reader.py#L469 to something like:

series_slcs = Path(name).glob("*.dcm")

I'm not sure if there are other suffixes to be considered.

function2-llx avatar Jun 19 '23 14:06 function2-llx

Hi @function2-llx , thanks for the suggestion, but I have seen some scanners exporting DICOM without the ".dcm" suffix, e.g. "IM_0001" on some ultrasound systems. It may be better to keep it optional.
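
For files without a ".dcm" suffix, one option is to look at the content instead of the name. DICOM Part 10 files start with a 128-byte preamble followed by the four bytes "DICM", so a check along these lines could separate "IM_0001"-style files from an .xml file. This is only a sketch: the helper names `has_dicom_magic` and `list_dicom_files` are made up for illustration and are not part of MONAI, and some older files lack the preamble entirely (which is the case pydicom's force=True exists for), so such files would be skipped here.

```python
from pathlib import Path


def has_dicom_magic(path):
    """Return True if the file has the 'DICM' marker after a 128-byte preamble."""
    with open(path, "rb") as f:
        f.seek(128)  # skip the preamble; seeking past EOF is harmless
        return f.read(4) == b"DICM"


def list_dicom_files(folder):
    """Return the sorted files in `folder` that carry the DICOM marker (hypothetical helper)."""
    return sorted(
        p for p in Path(folder).iterdir() if p.is_file() and has_dicom_magic(p)
    )
```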

mingxin-zheng avatar Jun 19 '23 15:06 mingxin-zheng

@mingxin-zheng Yes, maybe we can let the user decide whether to filter and what suffix to filter.
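
A rough sketch of what an optional suffix filter might look like on the user side. The function `list_series_files` and its `suffixes` parameter are hypothetical names chosen for illustration, not an existing MONAI option. With `suffixes=None` nothing is filtered, which keeps the current behaviour for scanners that export files without a suffix.

```python
from pathlib import Path


def list_series_files(folder, suffixes=None):
    """List files in `folder`; if `suffixes` is given, keep only those suffixes.

    Matching is case-insensitive, so ".dcm" also matches ".DCM".
    """
    files = (p for p in Path(folder).iterdir() if p.is_file())
    if suffixes is not None:
        wanted = {s.lower() for s in suffixes}
        files = (p for p in files if p.suffix.lower() in wanted)
    return sorted(files)
```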

function2-llx avatar Jun 19 '23 17:06 function2-llx

I tagged this as a feature request to see whether this is a common use case. Besides filtering by suffix, there are many other name filters that users may want to customize, so I am not sure whether a suffix filter or something more general is needed.

For now, since the read method accepts a list of files, you could glob the .dcm files into a list and pass that list to the reader in your script.

https://github.com/Project-MONAI/MONAI/blob/2cbed6cfa7a007fa8853a7bd8cf09303172686c9/monai/data/image_reader.py#L441

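import glob

import monai

# Collect only the DICOM files so that unrelated files in the folder are skipped.
files = glob.glob("dcm/*.DCM")
# Pass the list of files to the reader instead of the folder path.
imgs = monai.data.PydicomReader().read(files)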

If we go for a filter later, maybe we can allow a regex pattern here for file inclusion/exclusion in the pydicom reader.

https://github.com/Project-MONAI/MONAI/blob/2cbed6cfa7a007fa8853a7bd8cf09303172686c9/monai/data/image_reader.py#L470
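
To make the regex idea concrete, here is a minimal sketch of include/exclude filtering on file names using only the standard library. The function `filter_by_pattern` and its `include` / `exclude` parameters are assumptions for illustration; nothing like this exists in the pydicom reader at the linked commit.

```python
import re
from pathlib import Path


def filter_by_pattern(folder, include=None, exclude=None):
    """Return sorted files in `folder` whose names pass the regex filters.

    `include`: keep only names matching this pattern (if given).
    `exclude`: drop names matching this pattern (if given).
    """
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    if include is not None:
        inc = re.compile(include)
        files = [p for p in files if inc.search(p.name)]
    if exclude is not None:
        exc = re.compile(exclude)
        files = [p for p in files if not exc.search(p.name)]
    return files
```

An exclusion pattern such as `r"\.xml$"` would cover the LIDC-IDRI case while still accepting suffix-less files, whereas an inclusion pattern such as `r"(?i)\.dcm$"` reproduces the stricter suffix filter.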

mingxin-zheng avatar Jun 21 '23 06:06 mingxin-zheng

Thanks, I feel like using a regex pattern is a good idea.

On the other hand, reading a list of files may not work as expected, because each file is read as a separate image and the results are stacked along the channel dimension. Permuting the axes afterwards is not feasible either, since the affine compatibility check will fail.

https://github.com/Project-MONAI/MONAI/blob/2cbed6cfa7a007fa8853a7bd8cf09303172686c9/monai/data/image_reader.py#L443-L444

https://github.com/Project-MONAI/MONAI/blob/2cbed6cfa7a007fa8853a7bd8cf09303172686c9/monai/data/image_reader.py#L127-L131
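
Until a filter exists, one possible workaround that keeps the folder-as-series behaviour is to stage only the wanted files in a temporary folder and hand that folder to the reader. This is a sketch under the assumption that the reader treats a single directory path as one series, as described above; `stage_series` is a hypothetical helper, and copying is used instead of symlinks only to stay portable.

```python
import shutil
import tempfile
from pathlib import Path


def stage_series(folder, suffix=".dcm"):
    """Copy files with the given suffix into a fresh temporary directory.

    Returns the path of the new directory; the caller is responsible for
    removing it (for example with shutil.rmtree) when done.
    """
    staging = Path(tempfile.mkdtemp(prefix="dicom_series_"))
    for p in Path(folder).iterdir():
        if p.is_file() and p.suffix.lower() == suffix.lower():
            shutil.copy2(p, staging / p.name)
    return staging
```

The returned directory could then be passed to the reader as a single path instead of a list of files, which should avoid the channel stacking, at the cost of duplicating the series on disk.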

function2-llx avatar Jun 21 '23 11:06 function2-llx