OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
What happened?
I am running fMRIPrep 25.1.3 as a Slurm job for each of my participants. For some of those participants (but not all), I get this error:
OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
Both the working directory and the output directory were empty before submitting all jobs.
Looks like this is related to:
https://github.com/nipreps/fmriprep/issues/3332
I am using fMRIPrep as a Linux environment module, and it seems that on my machine the command
fmriprep
is an alias for:
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
Judging from the commands in other threads, it might help to ask the admins to also add the --containall flag?
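If so, the module alias might turn into something like this (just a sketch with hypothetical bind paths; with --containall, Singularity no longer mounts $HOME or passes the host environment through, so the admins would have to bind every directory the jobs touch):
singularity exec --containall \
    -B /path/to/rawdata -B /path/to/output -B /path/to/workdir \
    /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"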
What command did you use?
sbatch \
--cpus-per-task 8 \
--mem 24G \
--wrap "module load fmriprep/25.1.3 && \
fmriprep /path/to/rawdata/ /path/to/output participant \
--fs-license-file /path/to/freesurfer_license.txt \
--n_cpus 8 \
--skip-bids-validation \
--work-dir /path/to/workdir \
--participant_label NM01"
What version of fMRIPrep are you running?
25.1.3
How are you running fMRIPrep?
Singularity
Is your data BIDS valid?
Yes
Are you reusing any previously computed results?
No
Please copy and paste any relevant log output.
You are using fMRIPrep-25.1.3, and a newer version of fMRIPrep is available: 25.1.4.
Please check out our documentation about how and when to upgrade:
https://fmriprep.readthedocs.io/en/latest/faq.html#upgrading
250805-11:27:01,484 nipype.workflow IMPORTANT:
[.....]
250805-11:27:34,540 nipype.workflow IMPORTANT:
fMRIPrep started!
250805-11:27:36,301 nipype.workflow WARNING:
Storing result file without outputs
250805-11:27:36,314 nipype.workflow WARNING:
[Node] Error on "fmriprep_25_1_wf.fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd" (/zi/home/johannes.wiesner/work/slurm/project_newmeds/workdir/fmriprep/fmriprep_25_1_wf/fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd)
250805-11:27:36,327 nipype.workflow ERROR:
Node fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd failed to run on host zislhcn0112.zi.local.
250805-11:27:36,333 nipype.workflow ERROR:
Saving crash info to /zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/logs/crash-20250805-112736-johannes.wiesner-fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd-077b034a-1b7b-4ff0-9210-fdc3f4dc33d5.txt
Traceback (most recent call last):
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/plugins/multiproc.py", line 389, in _send_procs_to_workers
self.procs[jobid].run(updatehash=updatehash)
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 525, in run
result = self._run_interface(execute=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_interface
return self._run_command(execute)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 769, in _run_command
raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd.
Traceback:
Traceback (most recent call last):
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/interfaces/base/core.py", line 401, in run
runtime = self._run_interface(runtime)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/niworkflows/interfaces/bids.py", line 1429, in _run_interface
shutil.rmtree(dest)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 759, in rmtree
_rmtree_safe_fd(stack, onexc)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 703, in _rmtree_safe_fd
onexc(func, path, err)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 662, in _rmtree_safe_fd
os.rmdir(name, dir_fd=dirfd)
OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
250805-11:27:45,319 nipype.workflow INFO:
[Node] Setting-up "fmriprep_25_1_wf.sub_NMAR17_wf.anat_fit_wf.brain_extraction_wf.full_wm" in "/zi/home/johannes.wiesner/work/slurm/project_newmeds/workdir/fmriprep/fmriprep_25_1_wf/sub_NMAR17_wf/anat_fit_wf/brain_extraction_wf/full_wm".
[......]
250805-11:54:06,350 nipype.workflow IMPORTANT:
fMRIPrep finished successfully!
250805-11:54:06,361 nipype.workflow IMPORTANT:
Works derived from this fMRIPrep execution should include the boilerplate text found in <OUTPUT_PATH>/logs/CITATION.md.
Additional information / screenshots
It also surprised me to see the
fMRIPrep finished successfully!
output at the end; apparently the error was not caught properly?
Looks like a race condition when copying fsaverage to the freesurfer output directory. If you run a single process and let it complete, future runs shouldn't have a problem.
@mgxd
Yes, that's one of my hypotheses too (but I can't prove it, see: https://github.com/nipreps/fmriprep/issues/3258#issuecomment-2032118214). In the past, it worked well (I think?) when I first processed one subject, waited until it was finished, and then submitted all the other subjects.
But this would mean I first have to wait 10 hours and only then submit all the other jobs.
Is there a solution to this problem? fMRIPrep is, in theory, made for parallel processing of subjects, right?
The copy happens in the first minute or so of processing. Starting one job, waiting for it to run for a bit, and then starting the rest would work. The problem is when you have a queue where you're waiting a long time for your first job to start.
You could do something like the following prior to submitting:
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
@effigies:
The problem is when you have a queue where you're waiting a long time for your first job to start.
Can confirm this. On my cluster, usually all jobs pend for 1-2 minutes and then all start more or less at the same time.
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
What exactly is this doing?
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
Ah, interesting. I thought this was considered bad practice.
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
What exactly is this doing?
Copying the fsaverage from inside the container to where it will be in the output directory. It's what fMRIPrep does internally, so if you do it before running, you avoid the race because all processes will be happy with what they find.
Could we chain the commands so that everything happens in one go when the user calls fmriprep on their side?
Like so?
singularity exec /zi/apps/container/fmriprep_25.1.3.sif cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage && \
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
Wait, shouldn't the cp command also check whether the folder already exists? Like so:
DEST="$OUTPUT/sourcedata/freesurfer/fsaverage"
if [ ! -d "$DEST" ]; then
mkdir -p "$(dirname "$DEST")"
singularity exec /zi/apps/container/fmriprep_25.1.3.sif \
cp -r /opt/freesurfer/subjects/fsaverage "$DEST"
fi && \
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
which should translate to:
1.) Check whether a folder named sourcedata/freesurfer/fsaverage already exists in the user-defined output directory.
2.) If not, copy this folder from the container to the destination (this will only be necessary for the first job).
3.) If the folder already exists at the destination, do nothing.
4.) Start fmriprep.
5.) fmriprep also checks internally whether the folder exists and will skip the step (?)
Ah, wait, then I would end up with the same problem again, right? Two jobs that start at the same time would both try to create the directory and copy the contents.
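Maybe the copy could be made race-proof by staging fsaverage into a job-unique temporary directory first and then renaming it into place? A sketch (assuming the output directory lives on a single filesystem, where rename is atomic, and is visible inside the container):
DEST="$OUTPUT/sourcedata/freesurfer/fsaverage"
if [ ! -d "$DEST" ]; then
    mkdir -p "$(dirname "$DEST")"
    # stage into a directory only this job knows about
    TMP="$(mktemp -d "$(dirname "$DEST")/fsaverage.XXXXXX")"
    singularity exec /zi/apps/container/fmriprep_25.1.3.sif \
        cp -a /opt/freesurfer/subjects/fsaverage/. "$TMP"/
    # rename is atomic on the same filesystem; if another job got there
    # first, the rename fails and we discard our staged copy
    mv -T "$TMP" "$DEST" 2>/dev/null || rm -rf "$TMP"
fi
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
That way, every job either installs a complete fsaverage in one atomic step or finds one that is already complete.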
@effigies:
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
So I did that by using the following (pseudo) code:
sbatch \
--wrap "bash -c 'mkdir -p /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ && \
module load fmriprep/25.1.3 && \
fmriprep /path/to/rawdata /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ participant \
--work-dir /path/to/workdir \
--participant_label NM01 && \
while ! rsync -av --ignore-existing /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ /path/to/final/output/; do sleep 5; done'"
This creates a separate output directory for each subject, runs fMRIPrep for that subject, and uses rsync to sync the subject-wise output into a final output directory.
With that approach, I did not encounter the error above, which I guess is evidence for the "racing hypothesis".
That still leaves me with my question: shouldn't fMRIPrep, in theory, be designed to be protected against these issues, since the idea is to run subjects independently and in parallel? With the current design, all jobs depend somewhat on the first job.
@JohannesWiesner If you are running fMRIPrep as a single process across subjects, it is protected. However, if multiple processes are running independently, this introduces some complexity (lock files, handling stale locks, etc.). If you have any ideas or want to contribute, happy to review a pull request 😄
I very likely underestimate the complexity here, but wouldn't it be sufficient to give each subject its own sourcedata/freesurfer/ directory?
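For what it's worth, if I read the usage docs correctly, fMRIPrep already accepts --fs-subjects-dir, which could point every job at its own FreeSurfer directory and so sidestep the shared fsaverage entirely. A sketch with hypothetical paths (each job then keeps its own copy of fsaverage, trading disk space for independence):
fmriprep /path/to/rawdata /path/to/output participant \
    --participant-label NM01 \
    --fs-subjects-dir /path/to/output/freesurfer_NM01 \
    --work-dir /path/to/workdir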
I also had this issue. I'm using Docker and was able to resolve it by mapping a complete fsaverage directory into the container.
docker run --rm \
-v "${datadir}:/data:ro" \
-v "${outdir}:/out" \
-v "${fsavgdir}:/out/sourcedata/freesurfer/fsaverage:ro" \
-v "${fs_license}:/opt/freesurfer/license.txt:ro" \
nipreps/fmriprep:25.2.3 \
/data /out participant \
--participant-label $subid \
--fs-license-file /opt/freesurfer/license.txt
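In case someone lacks a local FreeSurfer installation to take ${fsavgdir} from, one option might be to copy fsaverage out of the image once, before submitting any jobs. A sketch (--entrypoint overrides the image's default fmriprep entrypoint; the host path is hypothetical):
# one-time extraction; creates <parent of fsavgdir>/fsaverage on the host
docker run --rm --entrypoint cp \
    -v "$(dirname "${fsavgdir}"):/extract" \
    nipreps/fmriprep:25.2.3 \
    -r /opt/freesurfer/subjects/fsaverage /extract/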