dicom-numpy ENH proposals from contrib-pydicom: read a folder, use proper dtype for rescaled images

Open · jond01 opened this issue 3 years ago · 1 comment

Hi, I recently reviewed the contrib-pydicom code. Its input-output/pydicom_series.py script serves (more or less) the same purpose as dicom-numpy. I noticed two points there that we may want to adopt here as well:

(1) Read a folder: add an API that reads a folder and extracts the image array and affine. The example in the docs does something similar, but takes a list of files:
```
import pydicom
import dicom_numpy

def extract_voxel_data(list_of_dicom_files):
    datasets = [pydicom.dcmread(f) for f in list_of_dicom_files]
    try:
        voxel_ndarray, ijk_to_xyz = dicom_numpy.combine_slices(datasets)
    except dicom_numpy.DicomImportException as e:
        # invalid DICOM data
        raise
    return voxel_ndarray
```
We could go one level up and accept only the path of the folder containing these files. The files in the folder can be filtered down to DICOM files with pydicom's built-in is_dicom function, then split into distinct series according to their SeriesInstanceUID. combine_slices would be called once per series, and the result would be a list of pairs: [(voxels0, affine0), (voxels1, affine1), ...].
(2) Tighten the dtype of rescaled images: currently, dicom-numpy uses np.float32 whenever a RescaleSlope or RescaleIntercept is present: https://github.com/innolitics/dicom-numpy/blob/204e95594a527bbab1444f9248432ffa01af024c/dicom_numpy/combine_slices.py#L108 However, this is often unnecessary. For example, if the rescale slope is 1.0 and the rescale intercept is -1024.0, the image can still have an integer dtype. contrib-pydicom seems to have a clever way to determine the proper dtype.
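One possible way to pick the dtype, sketched below. This is not the contrib-pydicom algorithm, and `rescaled_dtype` is a hypothetical name. It assumes that the decision is made from the full range of the stored pixel dtype rather than from the actual pixel values, and that np.float32 remains the fallback:

```python
import numpy as np

# Candidate integer dtypes, smallest first.
_INTEGER_CANDIDATES = (np.int8, np.uint8, np.int16, np.uint16,
                       np.int32, np.uint32, np.int64)


def rescaled_dtype(pixel_dtype, slope, intercept):
    """Return a dtype able to hold slope * pixel + intercept for any stored value."""
    pixel_dtype = np.dtype(pixel_dtype)
    slope, intercept = float(slope), float(intercept)

    # A fractional slope or intercept, or non-integer pixels, needs a float result.
    if not (np.issubdtype(pixel_dtype, np.integer)
            and slope.is_integer() and intercept.is_integer()):
        return np.dtype(np.float32)

    # Rescale both ends of the stored range; a negative slope swaps them.
    info = np.iinfo(pixel_dtype)
    bounds = [int(slope) * value + int(intercept) for value in (info.min, info.max)]
    low, high = min(bounds), max(bounds)

    for candidate in _INTEGER_CANDIDATES:
        limits = np.iinfo(candidate)
        if limits.min <= low and high <= limits.max:
            return np.dtype(candidate)

    # No integer candidate covers the range: fall back to float32.
    return np.dtype(np.float32)
```

With uint16 pixels, a slope of 1.0 and an intercept of -1024.0, the rescaled range is -1024 to 64511, so this sketch selects int32.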

May 05 '21 14:05 jond01

@jond01 thank you for the thoughtful suggestions. I agree that (1) would be a useful function to add.

I also agree that we shouldn't force users to convert to np.float32 if they don't want to. That said, I think callers should be able to assume that the output of combine_slices has a consistent dtype regardless of the input, so I don't think the default behavior should vary between dtypes dynamically. I'm sure there is a way to accommodate both requirements, though.

May 07 '21 17:05 johndgiese
