heudiconv

Failure when DICOM data does not have attribute 'AcquisitionDate'

Open tjhendrickson opened this issue 3 years ago • 11 comments

Summary

log
INFO: Doing conversion using dcm2niix
INFO: Converting /output_dir/sub-michaelmyers/ses-112921/anat/sub-michaelmyers_ses-112921_rec-normalized_run-01_T2w (208 DICOMs) -> /output_dir/sub-michaelmyers/ses-112921/anat . Converter: dcm2niix . Output types: ('nii.gz', 'dicom')
211207-14:57:18,631 nipype.workflow INFO:
	 [Node] Setting-up "convert" in "/tmp/dcm2niixqka1kfvu/convert".
INFO: [Node] Setting-up "convert" in "/tmp/dcm2niixqka1kfvu/convert".
211207-14:57:18,704 nipype.workflow INFO:
	 [Node] Running "convert" ("nipype.interfaces.dcm2nii.Dcm2niix"), a CommandLine Interface with command:
dcm2niix -b y -z y -x n -t n -m n -f sub-michaelmyers_ses-112921_rec-normalized_run-01_T2w_heudiconv770 -o /output_dir/sub-michaelmyers/ses-112921/anat -s n -v n /tmp/dcm2niixqka1kfvu/convert
INFO: [Node] Running "convert" ("nipype.interfaces.dcm2nii.Dcm2niix"), a CommandLine Interface with command:
dcm2niix -b y -z y -x n -t n -m n -f sub-michaelmyers_ses-112921_rec-normalized_run-01_T2w_heudiconv770 -o /output_dir/sub-michaelmyers/ses-112921/anat -s n -v n /tmp/dcm2niixqka1kfvu/convert
211207-14:57:20,239 nipype.interface INFO:
	 stdout 2021-12-07T14:57:20.239657:Chris Rorden's dcm2niiX version v1.0.20210317  GCC9.3.0 x86-64 (64-bit Linux)
INFO: stdout 2021-12-07T14:57:20.239657:Chris Rorden's dcm2niiX version v1.0.20210317  GCC9.3.0 x86-64 (64-bit Linux)
211207-14:57:20,240 nipype.interface INFO:
	 stdout 2021-12-07T14:57:20.239657:Found 208 DICOM file(s)
INFO: stdout 2021-12-07T14:57:20.239657:Found 208 DICOM file(s)
211207-14:57:20,240 nipype.interface INFO:
	 stdout 2021-12-07T14:57:20.239657:Convert 208 DICOM as /output_dir/sub-michaelmyers/ses-112921/anat/sub-michaelmyers_ses-112921_rec-normalized_run-01_T2w_heudiconv770 (300x320x208x1)
INFO: stdout 2021-12-07T14:57:20.239657:Convert 208 DICOM as /output_dir/sub-michaelmyers/ses-112921/anat/sub-michaelmyers_ses-112921_rec-normalized_run-01_T2w_heudiconv770 (300x320x208x1)
211207-14:57:25,791 nipype.interface INFO:
	 stdout 2021-12-07T14:57:25.791262:Conversion required 6.812600 seconds (5.249844 for core code).
INFO: stdout 2021-12-07T14:57:25.791262:Conversion required 6.812600 seconds (5.249844 for core code).
211207-14:57:25,830 nipype.workflow INFO:
	 [Node] Finished "convert".
INFO: [Node] Finished "convert".
WARNING: Failed to get date/time for the content: 'FileDataset' object has no attribute 'AcquisitionDate'
Traceback (most recent call last):
  File "/usr/local/miniconda/envs/heudiconv/bin/heudiconv", line 8, in <module>
    sys.exit(main())
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/cli/run.py", line 24, in main
    workflow(**kwargs)
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/main.py", line 337, in workflow
    prep_conversion(sid,
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/convert.py", line 207, in prep_conversion
    convert(cinfo,
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/convert.py", line 462, in convert
    convert_dicom(item_dicoms, bids_options, prefix,
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/convert.py", line 561, in convert_dicom
    compress_dicoms(item_dicoms,
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/dicoms.py", line 361, in compress_dicoms
    dcm_time = get_dicom_series_time(dicom_list)
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/heudiconv/dicoms.py", line 319, in get_dicom_series_time
    dcm_date = dicom.SeriesDate  # YYYYMMDD
  File "/usr/local/miniconda/envs/heudiconv/lib/python3.9/site-packages/pydicom/dataset.py", line 835, in __getattr__
    return object.__getattribute__(self, name)
- heuristic
--> import pdb

Heuristic
def create_key(template, outtype=('nii.gz','dicom'), annotation_classes=None):
    if template is None or not template:
        raise ValueError('Template must be a valid format string')
    return (template, outtype, annotation_classes)


def infotodict(seqinfo):
    """Heuristic evaluator for determining which runs belong where

    allowed template fields - follow python string module:

    item: index within category
    subject: participant id
    seqitem: run number during scanning
    subindex: sub index within group
    """ 
    t1 = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_rec-{rec}_run-{item:02d}_T1w')
    t2 = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_rec-{rec}_run-{item:02d}_T2w')

    rest_nordic_2mm_bold = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-NORDIC_run-{item:02d}_part-{part}_bold')
    rest_nordic_2mm_sbref = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-NORDIC_run-{item:02d}_sbref')
    rest_nordic_2mm_physio = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-NORDIC_run-{item:02d}_physio')
    
    rest_2p4mm_MB4_TR1200_bold = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr1200mb4_run-{item:02d}_bold')
    rest_2p4mm_MB4_TR1200_sbref = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr1200mb4_run-{item:02d}_sbref')
    rest_2p4mm_MB4_TR1200_physio = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr1200mb4_run-{item:02d}_physio')

    rest_2p4mm_MB6_TR820_bold = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr820mb6_run-{item:02d}_bold')
    rest_2p4mm_MB6_TR820_sbref = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr820mb6_run-{item:02d}_sbref')
    rest_2p4mm_MB6_TR820_physio = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2p4mm_acq-tr820mb6_run-{item:02d}_physio')
    
    rest_2mm_MB4_TR1510_bold = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1510mb4_run-{item:02d}_bold')
    rest_2mm_MB4_TR1510_sbref = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1510mb4_run-{item:02d}_sbref')
    rest_2mm_MB4_TR1510_physio = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1510mb4_run-{item:02d}_physio')
    
    rest_2mm_MB6_TR1030_bold = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1030mb6_run-{item:02d}_bold')
    rest_2mm_MB6_TR1030_sbref = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1030mb6_run-{item:02d}_sbref')
    rest_2mm_MB6_TR1030_physio = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest2mm_acq-tr1030mb6_run-{item:02d}_physio')
    
    spinecho_fieldmap_bold = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho_dir-{dir}_run-{item:02d}_epi')
    spinecho_fieldmap_physio = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho_dir-{dir}_run-{item:02d}_physio')
    
    spinecho_fieldmap_2mm_bold = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho2mm_dir-{dir}_run-{item:02d}_epi')
    spinecho_fieldmap_2mm_physio = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho2mm_dir-{dir}_run-{item:02d}_physio')
    
    spinecho_fieldmap_2p4mm_bold = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho2p4mm_dir-{dir}_run-{item:02d}_epi')
    spinecho_fieldmap_2p4mm_physio = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEcho2p4mm_dir-{dir}_run-{item:02d}_physio')
    
    spinecho_fieldmap_ME_bold = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEchoME_dir-{dir}_run-{item:02d}_epi')
    spinecho_fieldmap_ME_physio = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_acq-SpinEchoME_dir-{dir}_run-{item:02d}_physio')
    

    info = {t1: [], t2: [], rest_nordic_2mm_bold: [], rest_nordic_2mm_sbref: [],
            rest_nordic_2mm_physio: [], rest_2p4mm_MB4_TR1200_bold: [], rest_2p4mm_MB4_TR1200_sbref: [],
            rest_2p4mm_MB4_TR1200_physio: [], rest_2p4mm_MB6_TR820_bold: [], rest_2p4mm_MB6_TR820_sbref: [],
            rest_2p4mm_MB6_TR820_physio: [], rest_2mm_MB4_TR1510_bold: [], rest_2mm_MB4_TR1510_sbref: [],
            rest_2mm_MB4_TR1510_physio: [], rest_2mm_MB6_TR1030_bold: [], rest_2mm_MB6_TR1030_sbref: [],
            rest_2mm_MB6_TR1030_physio: [], spinecho_fieldmap_bold: [], spinecho_fieldmap_physio: [],
            spinecho_fieldmap_2mm_bold: [], spinecho_fieldmap_2mm_physio: [], spinecho_fieldmap_2p4mm_bold: [],
            spinecho_fieldmap_2p4mm_physio: [], spinecho_fieldmap_ME_bold: [], spinecho_fieldmap_ME_physio: []
            }

    for idx, s in enumerate(seqinfo):
        # retrieve previous element in seqinfo
        if idx > 0:
            s_previous = seqinfo[idx-1]
        if idx - 1 > 0:
            s_previous_two = seqinfo[idx-2]
        # retrieve next element in seqinfo
        if idx + 1 < len(seqinfo):
            s_next = seqinfo[idx+1]
        # retrieve next next element in seqinfo
        if idx + 2 < len(seqinfo):
            s_next_two = seqinfo[idx+2]

        # find pre scan normalized anatomicals
        if (s.dim3 == 208) and ('NORM' in s.image_type):
            if 'T1w_MPR' in s.series_description:
                rec = 'normalized'
                info[t1].append({'item': s.series_id, 'rec': rec})
            elif 'T2w' in s.series_description:
                rec = 'normalized'
                info[t2].append({'item': s.series_id, 'rec': rec})
        # find resting state scans. Differentiate by mag or phase
        elif (s.dim4 > 5) and ('rest' in s.series_description):
            if 'NORDIC' in s.protocol_name:
                if s.image_type[2] == 'M':
                    if (s_next.dim4 > 5) and ('NORDIC' in s_next.series_description) and (s_next.image_type[2] == 'P'):
                        part = 'mag'
                        info[rest_nordic_2mm_bold].append({'item': s.series_id,'part': part})
                elif s.image_type[2] == 'P':
                    if (s_previous.dim4 > 5) and ('NORDIC' in s_previous.series_description) and (s_previous.image_type[2] == 'M'):
                        part = 'phase'
                        info[rest_nordic_2mm_bold].append({'item': s.series_id,'part': part})
            elif 'rest' in s.protocol_name:
                if 'MB4' in s.protocol_name: # MB 4 resting state scans
                    if s.TR == 1.2: # TR 1200ms
                        info[rest_2p4mm_MB4_TR1200_bold].append({'item': s.series_id})
                    elif s.TR == 1.51: #TR 1510ms
                        info[rest_2mm_MB4_TR1510_bold].append({'item': s.series_id})
                elif 'MB6' in s.protocol_name: # MB 6 resting state scans
                    if s.TR == 0.82: #TR 820ms
                         info[rest_2p4mm_MB6_TR820_bold].append({'item': s.series_id})
                    elif s.TR == 1.03: #TR 1030ms
                        info[rest_2mm_MB6_TR1030_bold].append({'item': s.series_id})
                
        # retrieve field maps
        elif 'SpinEchoFieldMap' in s.series_description and s.dim4 == 3:
            if '2.0' in s.protocol_name:
                if 'AP' in s.series_description:
                    info[spinecho_fieldmap_2mm_bold].append({'item': s.series_id, 'dir': 'AP'})
                elif 'PA' in s.series_description:
                    info[spinecho_fieldmap_2mm_bold].append({'item': s.series_id, 'dir': 'PA'})
            elif '2.4' in s.protocol_name:
                if 'AP' in s.series_description:
                    info[spinecho_fieldmap_2p4mm_bold].append({'item': s.series_id, 'dir': 'AP'})
                elif 'PA' in s.series_description:
                    info[spinecho_fieldmap_2p4mm_bold].append({'item': s.series_id, 'dir': 'PA'})
            elif 'forME' in s.series_description:
                if 'AP' in s.series_description:
                    info[spinecho_fieldmap_ME_bold].append({'item': s.series_id, 'dir': 'AP'})
                elif 'PA' in s.series_description:
                    info[spinecho_fieldmap_ME_bold].append({'item': s.series_id, 'dir': 'PA'})
            else:
                if 'AP' in s.series_description:
                    info[spinecho_fieldmap_bold].append({'item': s.series_id, 'dir': 'AP'})
                elif 'PA' in s.series_description:
                    info[spinecho_fieldmap_bold].append({'item': s.series_id, 'dir': 'PA'})
                
        # retrieve sbref images
        elif 'SBRef' in s.series_description:
            if 'NORDIC' in s.protocol_name:
                if (s_next.dim4 > 5) and ('NORDIC' in s_next.series_description) \
                    and (s_next.image_type[2] == 'M') and (s_next_two.dim4 > 5) \
                    and ('NORDIC' in s_next_two.series_description) \
                    and (s_next_two.image_type[2] == 'P'):
                    info[rest_nordic_2mm_sbref].append({'item': s.series_id})
            elif 'rest' in s.protocol_name:
                if 'MB4' in s.protocol_name: # MB 4 resting state scans
                    if 'MB4' in s_next.protocol_name and s_next.TR == 1.2: # TR 1200ms
                        info[rest_2p4mm_MB4_TR1200_sbref].append({'item': s.series_id})
                    elif 'MB4' in s_next.protocol_name and s_next.TR == 1.51: #TR 1510ms
                        info[rest_2mm_MB4_TR1510_sbref].append({'item': s.series_id})
                elif 'MB6' in s.protocol_name: # MB 6 resting state scans
                    if 'MB6' in s_next.protocol_name and s_next.TR == 0.82: #TR 820ms
                         info[rest_2p4mm_MB6_TR820_sbref].append({'item': s.series_id})
                    elif 'MB6' in s_next.protocol_name and s_next.TR == 1.03: #TR 1030ms
                        info[rest_2mm_MB6_TR1030_sbref].append({'item': s.series_id})
                
        """                                        
        # retrieve physiological recordings
        elif 'PhysioLog' in s.dcm_dir_name:
            if 'rest' in s.series_description:
                if '10MIN' in s.protocol_name:
                    if (s_previous_two.dim4 > 5) and ('10MIN' in s_previous_two.series_description) \
                    and (s_previous_two.image_type[2] == 'M') and (s_previous.dim4 > 5) \
                    and ('10MIN' in s_previous.series_description) \
                    and (s_previous.image_type[2] == 'P'):
                        info[rest_ten_minute_physio].append({'item': s.series_id})
                elif '16MIN' in s.protocol_name:
                    if (s_previous_two.dim4 > 5) and ('16MIN' in s_previous_two.series_description) \
                    and (s_previous_two.image_type[2] == 'M') and (s_previous.dim4 > 5) \
                    and ('16MIN' in s_previous.series_description) \
                    and (s_previous.image_type[2] == 'P'):
                        info[rest_sixteen_minute_physio].append({'item': s.series_id})
            elif 'GEFieldMap' in s.series_description:
                if 'AP' in s.series_description:
                    if s_previous_two.image_type[2] == 'M' and (s_previous_two.dim4 >= 15) and ('GEFieldMap' in s_previous_two.series_description) and (s_previous.dim4 >= 15) and ('GEFieldMap' in s_previous.series_description) and (s_previous.image_type[2] == 'P'):
                        info[gradientecho_fieldmap_ME_bold_physio].append({'item': s.series_id, 'dir': 'AP'})
                elif 'PA' in s.series_description:
                    if s_previous_two.image_type[2] == 'M' and (s_previous_two.dim4 >= 15) and ('GEFieldMap' in s_previous_two.series_description) and (s_previous.dim4 >= 15) and ('GEFieldMap' in s_previous.series_description) and (s_previous.image_type[2] == 'P'):
                        info[gradientecho_fieldmap_ME_bold_physio].append({'item': s.series_id, 'dir': 'PA'})
        """
    return info

NB edited by @yarikoptic to provide collapsed details above to ease digestion

Platform details:

Choose one:

  • [ ] Local environment
  • [ X] Container
  • Heudiconv version:

0.9.0

tjhendrickson avatar Dec 07 '21 21:12 tjhendrickson

It looks like the AcquisitionDate is optional, while StudyDate is mandatory.

AcquisitionDate is read here: https://github.com/nipy/heudiconv/blob/0e0f9e33a3ad4dc9bdce33ff7da12ed660e9abd9/heudiconv/dicoms.py#L91

@yarikoptic , Should we modify that line to read:

date=dcminfo.get('AcquisitionDate') or dcminfo.get('StudyDate'),

?
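As a minimal sketch of what that one-line change would do (this is not the heudiconv code itself): `pick_date` is a hypothetical helper, and plain dicts stand in for the DICOM header, assuming only that the header object offers a dict-like `.get()` that returns None for a missing attribute.

```python
def pick_date(dcminfo):
    """Return AcquisitionDate if present and non-empty, else StudyDate, else None."""
    # `or` also skips an empty string, which de-identification tools may leave behind.
    return dcminfo.get('AcquisitionDate') or dcminfo.get('StudyDate')


# Plain dicts stand in for DICOM headers (hypothetical sample values).
full = {'AcquisitionDate': '20211207', 'StudyDate': '20211206'}
no_acquisition = {'StudyDate': '20211206'}
stripped = {}

for header in (full, no_acquisition, stripped):
    print(pick_date(header))
```

With both dates stripped, as in fully de-identified data, the expression still evaluates to None, so downstream code has to cope with a missing date either way.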

pvelasco avatar Dec 08 '21 18:12 pvelasco

Another comment on the particular DICOM data that I am dealing with: identifying information such as acquisition and study dates has been stripped, so a feature that allows "date" to be blank would be helpful.

tjhendrickson avatar Dec 08 '21 18:12 tjhendrickson

It looks like the AcquisitionDate is optional, while StudyDate is mandatory.

Hm... maybe for that particular line it would indeed work, although I guess "StudyDate" would be the same for all acquisitions, so acquisitions at the day boundary might erroneously get the same date, which might bring confusion. But that is only the first step in that function (get_dicom_series_time), whose whole purpose is to provide series_time, so it will also look at SeriesTime, and I guess the Study* values would also be the same within a study... indeed they are:

(git-annex)lena:~/datalad/dbic/QA[master]sourcedata/sub-qa/ses-20191218/func
$> dcmdump sub-qa_ses-20191218_task-rest_acq-p2_bold/000001.dcm| grep -e '\(Series\|Study\)Time'
(0008,0030) TM [091257.428000]                          #  14, 1 StudyTime
(0008,0031) TM [092559.974000]                          #  14, 1 SeriesTime

$> dcmdump sub-qa_ses-20191218_task-rest_acq-p2Xs4X35mm_bold/000001.dcm| grep -e '\(Series\|Study\)Time'
(0008,0030) TM [091257.428000]                          #  14, 1 StudyTime
(0008,0031) TM [093300.045000]                          #  14, 1 SeriesTime

So even though the code would "work", IMHO it would misinform the user, and the heuristic would not be able to make use of it anyway. So maybe it would be better just to return None and make sure that any subsequent code path can handle it?
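A rough sketch of the "return None and let callers handle it" option, under stated assumptions: `series_time_or_none` and `tarball_mtime` are hypothetical names, the header is any object with a dict-like `.get()`, and the fixed default of 0 is only one possible choice for the archive timestamp.

```python
import calendar
import time


def series_time_or_none(header):
    """Return the series time as epoch seconds (UTC), or None if it is unavailable."""
    date = header.get('SeriesDate')   # YYYYMMDD
    clock = header.get('SeriesTime')  # HHMMSS[.ffffff]
    if not date or not clock:
        return None
    stamp = date + clock.split('.', 1)[0]  # drop fractional seconds
    return calendar.timegm(time.strptime(stamp, '%Y%m%d%H%M%S'))


def tarball_mtime(header, default=0):
    """Pick the mtime for archived DICOMs, using a fixed default when no time exists."""
    series_time = series_time_or_none(header)
    return default if series_time is None else series_time


print(tarball_mtime({'SeriesDate': '20191218', 'SeriesTime': '092559.974000'}))
print(tarball_mtime({}))  # de-identified header: falls back to the default
```

A constant default keeps the archive timestamps deterministic even when every date has been stripped, at the cost of no longer reflecting when the series was acquired.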

yarikoptic avatar Dec 08 '21 19:12 yarikoptic

As a feature, returning None if something doesn't exist makes sense to me.

tjhendrickson avatar Dec 08 '21 19:12 tjhendrickson

What scanner is that? do you have some smallish dicom(s) which you could share (e.g. on phantom) so we could fix/test?

yarikoptic avatar Dec 08 '21 20:12 yarikoptic

@yarikoptic, let me ask the owners of the data whether there are any data use certifications/authentications at play. I'll be in touch!

tjhendrickson avatar Dec 08 '21 21:12 tjhendrickson

@yarikoptic, I was able to get some phantom data from the same scanner and with the same de-identification routines applied.

Attached are DICOM folders representing three localizers (phantom.zip). Let me know if you have any questions!

tjhendrickson avatar Dec 22 '21 16:12 tjhendrickson

Hi @yarikoptic, just checking in on the status of this.

Best,

-Tim

tjhendrickson avatar Jan 18 '22 20:01 tjhendrickson

Using @tjhendrickson's data, the only place where the code failed was not where I mentioned above, but in https://github.com/nipy/heudiconv/blob/0e0f9e33a3ad4dc9bdce33ff7da12ed660e9abd9/heudiconv/dicoms.py#L319 (in get_dicom_series_time).

As far as I can tell, this function returns the time at which a particular DICOM series was acquired; that time is then set in the DICOM tarballs to ensure reproducibility.

So, if the SeriesDate and SeriesTime are not present, maybe we can use the StudyDate and StudyTime, which will ensure reproducibility of the tarballs? Because the code needs a time, returning None in get_dicom_series_time would be a little bit messy.
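The fallback described above could look roughly like the following. This is a sketch rather than the actual PR: `series_or_study_time` is a hypothetical name, and a plain dict stands in for the DICOM header.

```python
import calendar
import time


def series_or_study_time(header):
    """Epoch seconds from SeriesDate/SeriesTime, else StudyDate/StudyTime, else None."""
    candidates = (('SeriesDate', 'SeriesTime'), ('StudyDate', 'StudyTime'))
    for date_key, time_key in candidates:
        date, clock = header.get(date_key), header.get(time_key)
        if date and clock:
            stamp = date + clock.split('.', 1)[0]  # YYYYMMDDHHMMSS
            return calendar.timegm(time.strptime(stamp, '%Y%m%d%H%M%S'))
    return None  # nothing usable, e.g. fully de-identified data


# Header with only study-level fields (values taken from the dcmdump output earlier in the thread).
print(series_or_study_time({'StudyDate': '20191218', 'StudyTime': '091257.428000'}))
```

Note that with the study-level fallback every series in a study gets the same timestamp, and a header with no dates at all still yields None, so the caller needs some default in that case.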

The line I referenced above (https://github.com/nipy/heudiconv/blob/0e0f9e33a3ad4dc9bdce33ff7da12ed660e9abd9/heudiconv/dicoms.py#L91) seems to return None, but it doesn't give an error. When creating the dicominfo.tsv with the SeqInfo, it enters None for the date. The same applies if the time is not present.

If you are OK with using StudyDate and StudyTime for get_dicom_series_time, I have a PR ready.

pvelasco avatar Jan 21 '22 17:01 pvelasco

The data that we have is de-identified, so it is entirely likely that no dates will be present in the DICOM header, as those represent PHI. Would it be feasible to implement this so that it doesn't require any dates to be present in the DICOM header?

Best

-Tim

tjhendrickson avatar Feb 19 '22 03:02 tjhendrickson

Just checking up on this.

Best,

-Tim

tjhendrickson avatar Mar 14 '22 18:03 tjhendrickson