Preprocessing stuck on hstack_confounds
I am testing out a new compute platform and keep running into the same preprocessing error, in which the workflow stalls during hstack_confounds. The Slurm job continues to run until the hard time limit, but no DWI output is produced in bids/derivatives/qsiprep - only anatomical preprocessing outputs. This happens with Singularity containers for both QSIPrep v0.19.0 and v0.19.1, using the same $SINGULARITY_TMPDIR as the working directory. The command is as follows:
SINGULARITY_CACHEDIR=${scachedir} SINGULARITY_TMPDIR=${stmpdir} singularity run --cleanenv --bind ${IMAGEDIR}:/imgdir,${stmpdir}:/sing_scratch,${projDir}:/data \
${IMAGEDIR}/qsiprep-v0.19.1.sif \
--fs-license-file /imgdir/license.txt /data/bids/sourcedata /data/bids/derivatives \
--output-resolution 2.5 --denoise-method patch2self --resource-monitor -w /sing_scratch \
--nthreads 24 --omp-nthreads 8 --low-mem --longitudinal --separate_all_dwis \
--bids-filter-file /data/bids/derivatives/qsiprep/config_A.json \
-vv --notrack --write-graph --pepolar-method TOPUP \
participant --participant-label ${subject}
The error is first thrown here:
[Node] Setting-up "qsiprep_wf.single_subject_SAY542_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.dwi_denoise_ses_A_run_1_dwi_wf.hstack_confounds" in "/sing_scratch/qsiprep_wf/single_subje>
231006-09:52:35,859 nipype.workflow INFO:
[Node] Executing "hstack_confounds" <qsiprep.interfaces.dwi_merge.StackConfounds>
231006-09:52:35,859 nipype.workflow WARNING:
[Node] Error on "qsiprep_wf.single_subject_SAY542_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.dwi_denoise_ses_A_run_1_dwi_wf.hstack_confounds" (/sing_scratch/qsiprep_wf/single_subject_SA>
exception calling callback for <Future at 0x2b916bc90a00 state=finished raised FileNotFoundError>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 722, in _run_command
result = self._interface.run(cwd=outdir, ignore_exception=True)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 388, in run
self._check_mandatory_inputs()
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/interfaces/base/core.py", line 275, in _check_mandatory_inputs
raise ValueError(msg)
ValueError: StackConfounds requires a value for input 'in_files'. For a list of required inputs, see StackConfounds.help()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 70, in run_node
result["result"] = node.result
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 223, in result
return _load_resultfile(
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/utils.py", line 291, in load_resultfile
raise FileNotFoundError(results_file)
FileNotFoundError: /sing_scratch/qsiprep_wf/single_subject_SAY542_wf/dwi_preproc_ses_A_run_1_wf/pre_hmc_wf/merge_and_denoise_wf/dwi_denoise_ses_A_run_1_dwi_wf/hstack_confounds/result_hstack_confounds.pklz
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/_base.py", line 328, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return _load_resultfile(
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/utils.py", line 291, in load_resultfile
raise FileNotFoundError(results_file)
FileNotFoundError: /sing_scratch/qsiprep_wf/single_subject_SAY542_wf/dwi_preproc_ses_A_run_1_wf/pre_hmc_wf/merge_and_denoise_wf/dwi_denoise_ses_A_run_1_dwi_wf/hstack_confounds/result_hstack_confounds.pklz
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/_base.py", line 328, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
FileNotFoundError: /sing_scratch/qsiprep_wf/single_subject_SAY542_wf/dwi_preproc_ses_A_run_1_wf/pre_hmc_wf/merge_and_denoise_wf/dwi_denoise_ses_A_run_1_dwi_wf/hstack_confounds/result_hstack_confounds.pklz
231006-09:52:37,505 nipype.workflow INFO:
[Job 25] Completed (qsiprep_wf.single_subject_SAY542_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.conformed_images).
231006-09:52:37,506 nipype.workflow INFO:
[MultiProc] Running 3 tasks, and 0 jobs ready. Free memory (GB): 452.62/453.22, Free processors: 7/24.
Currently running:
* qsiprep_wf.single_subject_SAY542_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.dwi_denoise_ses_A_run_1_dwi_wf.hstack_confounds
* qsiprep_wf.single_subject_SAY542_wf.anat_preproc_wf.anat_normalization_wf.anat_nlin_normalization
* qsiprep_wf.single_subject_SAY542_wf.anat_preproc_wf.synthseg_anat_wf.synthseg
231006-09:57:19,190 nipype.workflow INFO:
And hstack_confounds is eventually the sole process running until the job times out. The path /qsiprep_wf/single_subject_SAY542_wf/dwi_preproc_ses_A_run_1_wf/pre_hmc_wf/merge_and_denoise_wf/dwi_denoise_ses_A_run_1_dwi_wf/hstack_confounds/ does exist, but it only contains the following:
$ ls qsiprep_wf/single_subject_SAY277_wf/dwi_preproc_ses_A_run_1_wf/pre_hmc_wf/merge_and_denoise_wf/dwi_denoise_ses_A_run_1_dwi_wf/hstack_confounds/
_inputs.pklz _node.pklz _report
where the report.rst reads:
Node: single_subject_SAY277_wf (dwi_preproc_ses_A_run_1_wf (pre_hmc_wf (merge_and_denoise_wf (dwi_denoise_ses_A_run_1_dwi_wf (hstack_confounds (dwi_merge)
==========================================================================================================================================================
Hierarchy : qsiprep_wf.single_subject_SAY277_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.dwi_denoise_ses_A_run_1_dwi_wf.hstack_confounds
Exec ID : hstack_confounds
Original Inputs
---------------
* axis : 1
* in_files : <undefined>
* out_file : <undefined>
This occurs with or without --low-mem. Has anyone else run into this on an HPC?
This looks like something that has been happening a lot recently - a subprocess is killed and the rest of the pipeline hangs indefinitely. How much memory are you allowed to access on the compute host? You need quite a bit for synthseg and synthstrip.
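If you're not sure what the job actually received, something like the following should show the requested versus peak memory for a finished or timed-out Slurm job (the job ID is a placeholder, and the seff utility may not be installed on every cluster):

# requested vs. peak memory for a Slurm job; replace 12345678 with your job ID
sacct -j 12345678 --format=JobID,ReqMem,MaxRSS,Elapsed,State
# if seff is available, it prints a compact efficiency summary for the job
seff 12345678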
To get a faster node allocation I try to stay at or below 256G, but I have tried this with up to 500G of memory requested for the Slurm job (the request is sketched after the listing below). The same behavior occurs with 16, 24, or 32 CPU threads allocated. I think synthseg and synthstrip are able to complete, because there are outputs for the anat workflow:
$ ls bids/derivatives/qsiprep/sub-SAY542/anat
sub-SAY542_desc-aseg_dseg.nii.gz sub-SAY542_desc-preproc_T1w.nii.gz sub-SAY542_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5 sub-SAY542_from-T1wNative_to-T1wACPC_mode-image_xfm.mat
sub-SAY542_desc-brain_mask.nii.gz sub-SAY542_dseg.nii.gz sub-SAY542_from-T1wACPC_to-T1wNative_mode-image_xfm.mat sub-SAY542_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5
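For reference, the Slurm resource request looks roughly like this (the time limit shown is a placeholder, not my actual value):

#SBATCH --cpus-per-task=24   # have also tried 16 and 32
#SBATCH --mem=256G           # have also requested up to 500G
#SBATCH --time=48:00:00      # placeholder; the job runs until this hard limit is hit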
Here are the relevant contents of the working directory:
$ ls qsiprep_wf/*wf/anat*
acpc_aseg_to_dseg anat_normalization_wf anat_template_wf output_grid_wf rigid_acpc_resample_aseg rigid_acpc_resample_head seg2msks synthseg_anat_wf
anat_derivatives_wf anat_reports_wf get_template_image pad_anat_reference_wf rigid_acpc_resample_brain rigid_acpc_resample_mask seg_rpt synthstrip_anat_wf
$ ls qsiprep_wf/*wf/anat*/syn*
qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthseg_anat_wf:
synthseg
qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthstrip_anat_wf:
mask_brain mask_to_original_grid synthstrip
$ ls qsiprep_wf/*wf/anat*/syn*/*
qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthseg_anat_wf/synthseg:
_0x5c74e38b43160dd058d8e04fcb3e75e9.json _node.pklz command.txt sub-SAY542_ses-A_T1w_lps_trans_aseg.nii.gz sub-SAY542_ses-A_T1w_lps_trans_qc.csv
_inputs.pklz _report result_synthseg.pklz sub-SAY542_ses-A_T1w_lps_trans_post.nii.gz
qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthstrip_anat_wf/mask_brain:
_0xf9b00d8b4bee332bf0b3f022a4f02d7e.json _inputs.pklz _node.pklz _report command.txt masked_brain.nii result_mask_brain.pklz
qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthstrip_anat_wf/mask_to_original_grid:
_0x320567bc954b0aa201fc9acfb1fa223c.json _inputs.pklz _node.pklz _report command.txt result_mask_to_original_grid.pklz sub-SAY542_ses-A_T1w_lps_trans_mask_trans.nii.gz
ses-A/qsiprep_wf/single_subject_SAY542_wf/anat_preproc_wf/synthstrip_anat_wf/synthstrip:
_0xdfcab5356c161dc2eafc87ec97d1da5b.json _inputs.pklz _node.pklz _report command.txt result_synthstrip.pklz sub-SAY542_ses-A_T1w_lps_trans_brain.nii.gz sub-SAY542_ses-A_T1w_lps_trans_mask.nii.gz
The only errors that I see in the terminal output are for hstack_confounds - does this node depend on synthseg or synthstrip outputs? I will try running with dwidenoise instead of patch2self (command below) to see if the workflow can get past dwi_merge. Any other suggestions would be greatly appreciated! Thanks!
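The retry is essentially the same invocation as above, with only the denoising flag changed:

SINGULARITY_CACHEDIR=${scachedir} SINGULARITY_TMPDIR=${stmpdir} singularity run --cleanenv --bind ${IMAGEDIR}:/imgdir,${stmpdir}:/sing_scratch,${projDir}:/data \
${IMAGEDIR}/qsiprep-v0.19.1.sif \
--fs-license-file /imgdir/license.txt /data/bids/sourcedata /data/bids/derivatives \
--output-resolution 2.5 --denoise-method dwidenoise --resource-monitor -w /sing_scratch \
--nthreads 24 --omp-nthreads 8 --low-mem --longitudinal --separate_all_dwis \
--bids-filter-file /data/bids/derivatives/qsiprep/config_A.json \
-vv --notrack --write-graph --pepolar-method TOPUP \
participant --participant-label ${subject}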
Using dwidenoise has resolved the stuck hstack_confounds process:
[Node] Setting-up "qsiprep_wf.single_subject_SAY277_wf.dwi_preproc_ses_A_run_1_wf.pre_hmc_wf.merge_and_denoise_wf.dwi_denoise_ses_A_run_1_dwi_wf.hstack_confounds" in "/sing_scratch/qsiprep_wf/single_subje>
231006-15:01:34,587 nipype.workflow INFO:
[Node] Executing "hstack_confounds" <qsiprep.interfaces.dwi_merge.StackConfounds>
231006-15:01:34,594 nipype.workflow INFO:
[Node] Finished "hstack_confounds", elapsed time 0.006104s.
The problem seems to be related to patch2self denoising, which appears to complete based on the terminal output but perhaps never gets its outputs connected to the in_files input of hstack_confounds?
I totally missed that you're using patch2self. The wrapper for it in qsiprep could use a refactor; if you're using a lot of threads, it can take up an incredible amount of memory.
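If you want to stay on patch2self in the meantime, one thing worth trying (a guess on my part, not a tested fix) is rerunning with fewer threads so the denoiser holds fewer copies of the data in memory, e.g. swapping the thread flags in your command:

# lower thread counts for the same invocation; everything else unchanged
--nthreads 8 --omp-nthreads 2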