Issue preprocessing multiple multi-echo / multi-band datasets after fMRIPrep update
What happened?
We updated fMRIPrep to version 21.0.0. We have been using fMRIPrep for a few years now and have preprocessed multiple datasets with this pipeline. After making the upgrade, we wanted to re-preprocess two of the datasets. We were able to re-preprocess the first with no problem (NOT multi-echo data). We went to preprocess the second dataset (multiband/multi-echo) and we can't preprocess more than one participant at a time. As I mentioned, these data have been preprocessed with fMRIPrep in the past with no problem; I could run 10 at a time. (All work directories have been deleted, and the 'old' data have been moved out of the fMRIPrep path.) I am testing with 2-3 participants at a time, and have tried the following flags: `--low-mem`, `--mem-mb 55000`, and `--use-plugin plugin.yml` (with a corresponding file containing `plugin: LegacyMultiProc` and `plugin_args: {maxtasksperchild: 1, memory_gb: 60, n_procs: 10, raise_insufficient: false}`).
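For reference, the plugin file described above, laid out as a YAML file (field names and values copied from the description; this is the standard nipype plugin-config layout):

```yaml
# plugin.yml -- nipype execution plugin configuration
plugin: LegacyMultiProc
plugin_args:
  maxtasksperchild: 1
  memory_gb: 60
  n_procs: 10
  raise_insufficient: false
```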
Without the flags, and with the `--low-mem` flag, I get a `BrokenProcessPool` error and fMRIPrep doesn't finish. With the other two flags, I get a node error: `fmriprep_wf.single_subject_3156_wf.func_preproc_ses_1_task...._echo_1_wf.bold_t2smap_wf.t2smap_node` (full error below). I get this node error for each participant, at a different number of nodes. If I run the participants separately, they run with no problem.
What command did you use?
singularity run --cleanenv -B /data/MNA/data /data/MNA/fmriprep_21.0.0.simg /data/MNA/data/Nifti /data/MNA/data/Preprocessed participant --participant_label sub-3158 3159 3162 --ignore slicetiming sbref --fs-license-file /home/nicoler/license.txt
('original' command used. I also tried the various flags I outlined above)
What version of fMRIPrep are you running?
21.0.0
How are you running fMRIPrep?
Singularity
Is your data BIDS valid?
Yes
Are you reusing any previously computed results?
No
Please copy and paste any relevant log output.
When fMRIPrep doesn't finish (just hangs), the error is as follows:
File "/opt/conda/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/opt/conda/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
When it runs, an example of the node error:
File: /data/MNA/data/Preprocessed/sub-3158/log/20220301-105737_4b639652-12b4-452d-9125-1d5c84795b5b/crash-20220301-135236-nicoler-t2smap_node-473aa1f0-7183-40bb-b5f9-29a4a3e11102.txt
Working Directory: /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/bold_t2smap_wf/t2smap_node
Inputs:
args:
echo_times: [0.0132, 0.03119, 0.04918]
environ: {}
fittype: curvefit
in_files:
mask_file:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/plugins/legacymultiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 516, in run
result = self._run_interface(execute=True)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_interface
return self._run_command(execute)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 741, in _run_command
result = self._interface.run(cwd=outdir)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 428, in run
runtime = self._run_interface(runtime)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 822, in _run_interface
self.raise_exception(runtime)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 749, in raise_exception
raise RuntimeError(
RuntimeError: Command:
t2smap -d /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_0/merge/vol0000_unwarped_merged.nii.gz /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_1/merge/vol0000_unwarped_merged.nii.gz /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_2/merge/vol0000_unwarped_merged.nii.gz -e 13.2 31.189999999999998 49.18 --mask /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/brainextraction_wf/_echoidx_0/masker/clipped_mask.nii.gz --fittype curvefit
Standard output:
Standard error:
INFO t2smap:t2smap_workflow:229 Using output directory: /data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/bold_t2smap_wf/t2smap_node
INFO t2smap:t2smap_workflow:239 Loading input data: ['/data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_0/merge/vol0000_unwarped_merged.nii.gz', '/data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_1/merge/vol0000_unwarped_merged.nii.gz', '/data/MNA/data/work/fmriprep_wf/single_subject_3158_wf/func_preproc_ses_1_task_sop_run_001_echo_1_wf/unwarp_wf/_echoidx_2/merge/vol0000_unwarped_merged.nii.gz']
INFO t2smap:t2smap_workflow:255 Using user-defined mask
INFO t2smap:t2smap_workflow:258 Computing adaptive T2* map
Killed
Return code: 137
Additional information / screenshots
No response
This could be an issue from using a more recent version of tedana. @tsalo could I bug you to take a look?
It may also be that we need unwarp_wf to output uncompressed NIfTI files when --low-mem is passed. It's possible that in previous versions the inputs to t2smap were uncompressed.
It looks like tedana's memory usage is the problem here. tedana should be able to handle compressed or uncompressed NIfTI files equally well, since it just loads them into numpy arrays with nibabel. tedana does have a --lowmem option that could be passed from fMRIPrep when --low-mem is used, although I don't think it would have an effect on the problematic step here.
Two options I could see are:
- Within tedana, we could see if we can improve the memory usage of the `curvefit` option when `--lowmem` is used.
- Within fMRIPrep, set `--fittype` to `loglin` instead of `curvefit` when `--low-mem` is used. The log-linear T2* estimation method should be less accurate, but faster and less memory intensive.
Is there anything I can do to help test, next steps, etc.?
Thanks so much for taking a look at this issue - obviously we don't want to have to preprocess an entire dataset one participant at a time!
@njr175 I've opened an issue in the tedana repo (https://github.com/ME-ICA/tedana/issues/856), but it will probably be a while before we can propagate any changes to an fMRIPrep release, since we'll need to (1) fix the problem on tedana's side, (2) make a new release, (3) pin the new version in fMRIPrep's requirements, and (4) make a new fMRIPrep release.
If you're able to build the Docker image from a fork, you could manually switch from nonlinear T2* estimation to the log-linear version. Basically, you would just need to switch the order here:
https://github.com/nipreps/fmriprep/blob/32f7b06f148fc6020c5ecedb77fd8dee10a8813f/fmriprep/interfaces/multiecho.py#L64-L73
Instead of `'curvefit', 'loglin'`, which makes `curvefit` (the nonlinear approach) the default, you would have `'loglin', 'curvefit'`, so that the log-linear method is the default.
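To illustrate why swapping the order matters: in a `traits.Enum`, the first value listed becomes the default. A minimal sketch of that behavior in plain Python (the `default_fittype` helper is illustrative only, not fMRIPrep's actual code; see the linked file for the real trait definition):

```python
# In fMRIPrep's input spec, fittype is defined roughly like:
#     fittype = traits.Enum("curvefit", "loglin", ...)   # curvefit is default
# and the proposed change swaps the order:
#     fittype = traits.Enum("loglin", "curvefit", ...)   # loglin is default

def default_fittype(allowed):
    """Mimic traits.Enum default selection: the first listed value wins."""
    return allowed[0]

print(default_fittype(("curvefit", "loglin")))  # prints "curvefit"
print(default_fittype(("loglin", "curvefit")))  # prints "loglin"
```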
Thanks for offering a solution while we wait for the fix, @tsalo. Do I need to include one of the flags I was using before (`--low-mem`, `--mem-mb 55000`, `--use-plugin plugin.yml`)?
I switched the order as you indicated, and tried to re-run fMRIPrep (without any of the above flags), and I got the broken process pool error again. I was testing with 3 individuals.
ERROR File "/opt/conda/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Just wanted to send an update: we still have not been able to get the new version of fMRIPrep to work on more than one participant with multi-echo data. With the fix you mentioned, I tried with and without the three flags; it still doesn't work. To double-check that something didn't happen with our Ubuntu server, I ran a handful of participants through an old version of fMRIPrep and they ran without error.
I'm sorry for the delay in responding. If you look at the logs, do they show the t2smap command used?
Yesterday I tried fMRIPrep 22.0.0rc3 on one subject (one session) and got the following error. It seems relevant to me, but if it's not, please let me know and I'll delete my comment.
Error message:
Node: fmriprep_22_0_wf.single_subject_s003_wf.func_preproc_ses_3_task_vw_acq_mb4mesl56_run_2_echo_1_wf.bold_t2smap_wf.t2smap_node
Working directory: /work/fmriprep_22_0_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_2_echo_1_wf/bold_t2smap_wf/t2smap_node
Node inputs:
args = <undefined>
echo_times = [0.0124, 0.03013, 0.04786, 0.06559]
environ = {}
fittype = curvefit
in_files = <undefined>
mask_file = <undefined>
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 524, in run
result = self._run_interface(execute=True)
File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 642, in _run_interface
return self._run_command(execute)
File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 750, in _run_command
raise NodeExecutionError(
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node t2smap_node.
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 454, in aggregate_outputs
setattr(outputs, key, val)
File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/traits_extension.py", line 330, in validate
value = super(File, self).validate(objekt, name, value, return_pathlike=True)
File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/traits_extension.py", line 135, in validate
self.error(objekt, name, str(value))
File "/opt/conda/lib/python3.9/site-packages/traits/base_trait_handler.py", line 74, in error
raise TraitError(
traits.trait_errors.TraitError: The 't2star_map' trait of a T2SMapOutputSpec instance must be a pathlike object or string representing an existing file, but a value of '/work/fmriprep_22_0_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_2_echo_1_wf/bold_t2smap_wf/t2smap_node/T2starmap.nii.gz' <class 'str'> was specified.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 401, in run
outputs = self.aggregate_outputs(runtime)
File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 461, in aggregate_outputs
raise FileNotFoundError(msg)
FileNotFoundError: No such file or directory '/work/fmriprep_22_0_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_2_echo_1_wf/bold_t2smap_wf/t2smap_node/T2starmap.nii.gz' for output 't2star_map' of a T2SMap interface
Command I executed:
singularity run --cleanenv --bind /mnt/dbgroup-share/mpawlik/data-ccns/anclang:/data --bind /mnt/dbgroup-share/mpawlik/scratch:/work --bind /home/mpawlik/bids/freesurfer_license.txt:/opt/freesurfer/license.txt fmriprep_22.0.0rc3 /data /data/derivatives/fmriprep participant --nprocs 16 --me-output-echos --fs-no-reconall --notrack --skip_bids_validation -v -w /work
Update: fMRIPrep 21.0.2 threw `numpy.core._exceptions.MemoryError`. This was most probably due to too many parallel jobs requesting too much memory.
Here's the error:
Node: fmriprep_wf.single_subject_s003_wf.func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf.bold_t2smap_wf.t2smap_node
Working directory: /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/bold_t2smap_wf/t2smap_node
Node inputs:
args = <undefined>
echo_times = [0.0124, 0.03013, 0.04786, 0.06559]
environ = {}
fittype = curvefit
in_files = <undefined>
mask_file = <undefined>
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 516, in run
result = self._run_interface(execute=True)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 635, in _run_interface
return self._run_command(execute)
File "/opt/conda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 741, in _run_command
result = self._interface.run(cwd=outdir)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 428, in run
runtime = self._run_interface(runtime)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 822, in _run_interface
self.raise_exception(runtime)
File "/opt/conda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 749, in raise_exception
raise RuntimeError(
RuntimeError: Command:
t2smap -d /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_0/merge/vol0000_unwarped_merged.nii.gz /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_1/merge/vol0000_unwarped_merged.nii.gz /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_2/merge/vol0000_unwarped_merged.nii.gz /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_3/merge/vol0000_unwarped_merged.nii.gz -e 12.4 30.13 47.86 65.58999999999999 --mask /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/brainextraction_wf/_echoidx_0/masker/clipped_mask.nii.gz --fittype curvefit
Standard output:
Standard error:
INFO t2smap:t2smap_workflow:241 Using output directory: /work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/bold_t2smap_wf/t2smap_node
INFO t2smap:t2smap_workflow:251 Loading input data: ['/work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_0/merge/vol0000_unwarped_merged.nii.gz', '/work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_1/merge/vol0000_unwarped_merged.nii.gz', '/work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_2/merge/vol0000_unwarped_merged.nii.gz', '/work/fmriprep_wf/single_subject_s003_wf/func_preproc_ses_3_task_vw_acq_mb4mesl56_run_1_echo_1_wf/unwarp_wf/_echoidx_3/merge/vol0000_unwarped_merged.nii.gz']
INFO t2smap:t2smap_workflow:267 Using user-defined mask
INFO t2smap:t2smap_workflow:270 Computing adaptive T2* map
Traceback (most recent call last):
File "/opt/conda/bin/t2smap", line 8, in <module>
sys.exit(_main())
File "/opt/conda/lib/python3.8/site-packages/tedana/workflows/t2smap.py", line 343, in _main
t2smap_workflow(**kwargs)
File "/opt/conda/lib/python3.8/site-packages/tedana/workflows/t2smap.py", line 272, in t2smap_workflow
(t2s_limited, s0_limited, t2s_full, s0_full) = decay.fit_decay(
File "/opt/conda/lib/python3.8/site-packages/tedana/decay.py", line 372, in fit_decay
data_masked = data[mask, :, :]
numpy.core._exceptions.MemoryError: Unable to allocate 1.73 GiB for an array with shape (90714, 4, 640) and data type float64
Return code: 1
Hi @mateuszpawlik @tsalo
Did you ever get this working? I am having the same problem with t2smap, but within a subject (many runs per subject). Thanks!
We successfully executed fMRIPrep 21.0.2. Here's my command:
singularity run --cleanenv --bind /mnt/dbgroup-share/mpawlik/data-ccns/anclang:/data --bind /mnt/dbgroup-share/mpawlik/scratch:/work --bind /home/mpawlik/bids/freesurfer_license.txt:/opt/freesurfer/license.txt fmriprep_21.0.2.sif /data /data/derivatives/fmriprep participant --me-output-echos --nprocs 4 --resource-monitor --fs-no-reconall --notrack --skip-bids-validation --task-id vw -v -w /work
We haven't tried with newer versions.
So the problem here is that `T2SMap` is not annotated with its memory consumption: https://github.com/nipreps/fmriprep/blob/master/fmriprep/workflows/bold/t2s.py#L101 which means we assume it will use 100 MiB. We can see from @mateuszpawlik's node that memory usage scales roughly as voxels × echoes × time points × 8 bytes (float64).
@tsalo are there any extra coefficients or multiplicative factors that should be considered?
At the very least, taking the full image set size and throwing in a factor of two will be a better estimate than 100 MiB.
I ran a local test and the relevant code (the bit that seemed to be failing) was maxing out at ~2 GB on a 288 MB file with three echoes. The proposed change in #2898 came out to 1.73 GB, so it might need a bit more? The function also uses scipy.optimize.curve_fit, but I have no clue what the memory usage tends to be for that, sorry.
Okay, a factor of 2.5 would get pretty close to 2 GB.
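To make the arithmetic concrete, here is a rough estimator matching the failed allocation in @mateuszpawlik's log (voxels × echoes × time points × 8 bytes, times a safety factor). The helper name and the factor of 2.5 are illustrative sketches of the idea discussed above, not fMRIPrep's actual code:

```python
def estimate_t2smap_mem_gb(n_voxels, n_echoes, n_volumes, factor=2.5):
    """Rough memory estimate for the T2* fit.

    The data are loaded as float64, so the base cost is
    voxels * echoes * volumes * 8 bytes; a multiplicative safety
    factor covers intermediate copies made during the fit.
    (Illustrative helper, not fMRIPrep's actual annotation code.)
    """
    base_bytes = n_voxels * n_echoes * n_volumes * 8
    return factor * base_bytes / (1024 ** 3)

# The failed allocation above had shape (90714, 4, 640) in float64:
print(round(90714 * 4 * 640 * 8 / (1024 ** 3), 2))  # 1.73 (GiB), matching the log
print(round(estimate_t2smap_mem_gb(90714, 4, 640), 2))
```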