OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
What happened?
I am running fMRIPrep 25.1.3 as a Slurm job for each of my participants. For some of those participants (but not all), I get this error:
OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
Both the working directory and the output directory were empty before submitting all jobs.
Looks like this is related to:
https://github.com/nipreps/fmriprep/issues/3332
I am using fMRIPrep as a Linux environment module, and it seems that on my machine the command
fmriprep
is an alias for:
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
Judging from the commands in other threads, it might help to ask the admins to also add the --containall flag?
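If so, the module alias might turn into something like this (just a sketch with hypothetical bind paths; with --containall, Singularity no longer mounts $HOME or passes the host environment through, so the admins would have to bind every directory the jobs touch):
singularity exec --containall \
    -B /path/to/rawdata -B /path/to/output -B /path/to/workdir \
    /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"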
What command did you use?
sbatch \
--cpus-per-task 8 \
--mem 24G \
--wrap "module load fmriprep/25.1.3 && \
fmriprep /path/to/rawdata/ /path/to/output participant \
--fs-license-file /path/to/freesurfer_license.txt \
--n_cpus 8 \
--skip-bids-validation \
--work-dir /path/to/workdir \
--participant_label NM01"
What version of fMRIPrep are you running?
25.1.3
How are you running fMRIPrep?
Singularity
Is your data BIDS valid?
Yes
Are you reusing any previously computed results?
No
Please copy and paste any relevant log output.
You are using fMRIPrep-25.1.3, and a newer version of fMRIPrep is available: 25.1.4.
Please check out our documentation about how and when to upgrade:
https://fmriprep.readthedocs.io/en/latest/faq.html#upgrading
250805-11:27:01,484 nipype.workflow IMPORTANT:
[.....]
250805-11:27:34,540 nipype.workflow IMPORTANT:
fMRIPrep started!
250805-11:27:36,301 nipype.workflow WARNING:
Storing result file without outputs
250805-11:27:36,314 nipype.workflow WARNING:
[Node] Error on "fmriprep_25_1_wf.fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd" (/zi/home/johannes.wiesner/work/slurm/project_newmeds/workdir/fmriprep/fmriprep_25_1_wf/fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd)
250805-11:27:36,327 nipype.workflow ERROR:
Node fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd failed to run on host zislhcn0112.zi.local.
250805-11:27:36,333 nipype.workflow ERROR:
Saving crash info to /zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/logs/crash-20250805-112736-johannes.wiesner-fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd-077b034a-1b7b-4ff0-9210-fdc3f4dc33d5.txt
Traceback (most recent call last):
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/plugins/multiproc.py", line 389, in _send_procs_to_workers
self.procs[jobid].run(updatehash=updatehash)
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 525, in run
result = self._run_interface(execute=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_interface
return self._run_command(execute)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/pipeline/engine/nodes.py", line 769, in _run_command
raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20250805_112635_a003560a_4143_40af_abc0_4e768c4f0fcd.
Traceback:
Traceback (most recent call last):
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/nipype/interfaces/base/core.py", line 401, in run
runtime = self._run_interface(runtime)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/fmriprep/lib/python3.12/site-packages/niworkflows/interfaces/bids.py", line 1429, in _run_interface
shutil.rmtree(dest)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 759, in rmtree
_rmtree_safe_fd(stack, onexc)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 703, in _rmtree_safe_fd
onexc(func, path, err)
File "/opt/conda/envs/fmriprep/lib/python3.12/shutil.py", line 662, in _rmtree_safe_fd
os.rmdir(name, dir_fd=dirfd)
OSError: [Errno 39] Directory not empty: '/zi/home/johannes.wiesner/work/slurm/project_newmeds/output/fmriprep/sourcedata/freesurfer/fsaverage/label'
250805-11:27:45,319 nipype.workflow INFO:
[Node] Setting-up "fmriprep_25_1_wf.sub_NMAR17_wf.anat_fit_wf.brain_extraction_wf.full_wm" in "/zi/home/johannes.wiesner/work/slurm/project_newmeds/workdir/fmriprep/fmriprep_25_1_wf/sub_NMAR17_wf/anat_fit_wf/brain_extraction_wf/full_wm".
[......]
250805-11:54:06,350 nipype.workflow IMPORTANT:
fMRIPrep finished successfully!
250805-11:54:06,361 nipype.workflow IMPORTANT:
Works derived from this fMRIPrep execution should include the boilerplate text found in <OUTPUT_PATH>/logs/CITATION.md.
Additional information / screenshots
It also surprised me to see the
fMRIPrep finished successfully!
output at the end; apparently the error was not caught properly?
Looks like a race condition when copying fsaverage to the freesurfer output directory. If you run a single process and let it complete, future runs shouldn't have a problem.
@mgxd
Yes, that's one of my hypotheses too (but I can't prove it, see: https://github.com/nipreps/fmriprep/issues/3258#issuecomment-2032118214). In the past, it worked well (I think?) when I first processed one subject, waited until it was finished, and then submitted all the other subjects.
But this would mean I first have to wait 10 hours and only then submit all the other jobs.
Is there a solution to this problem? fMRIPrep is, in theory, made for parallel processing of subjects, right?
The copy happens in the first minute or so of processing. Starting one job, waiting for it to run for a bit, and then starting the rest would work. The problem is when you have a queue where you're waiting a long time for your first job to start.
You could do something like the following prior to submitting:
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
@effigies:
The problem is when you have a queue where you're waiting a long time for your first job to start.
Can confirm this. On my cluster, usually all jobs pend for 1-2 minutes and then all start more or less at the same time.
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
What exactly is this doing?
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
Ah, interesting. I thought this was considered bad practice.
singularity exec [OPTIONS] cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage
What exactly is this doing?
Copying the fsaverage from inside the container to where it will be in the output directory. It's what fMRIPrep does internally, so if you do it before running, you avoid the race because all processes will be happy with what they find.
Could we chain the commands so that everything happens in one go when the user calls fmriprep on their side?
Like so?
singularity exec /zi/apps/container/fmriprep_25.1.3.sif cp -r /opt/freesurfer/subjects/fsaverage $OUTPUT/sourcedata/freesurfer/fsaverage && \
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
Wait, shouldn't the cp command also check whether the folder already exists? Like so:
DEST="$OUTPUT/sourcedata/freesurfer/fsaverage"
if [ ! -d "$DEST" ]; then
mkdir -p "$(dirname "$DEST")"
singularity exec /zi/apps/container/fmriprep_25.1.3.sif \
cp -r /opt/freesurfer/subjects/fsaverage "$DEST"
fi && \
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
which should translate to:
1.) Check whether a folder named sourcedata/freesurfer/fsaverage already exists in the user-defined output directory.
2.) If not, copy this folder from the container to the destination (this will only be necessary for the first job).
3.) If the folder already exists at the destination, do nothing.
4.) Start fmriprep.
5.) fmriprep also checks internally whether the folder exists and will skip the step (?)
Ah, wait, then I would end up with the same problem again, right? Two jobs that start at the same time would both try to create the directory and copy the contents.
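Maybe the copy could be made race-proof by staging fsaverage into a job-unique temporary directory first and then renaming it into place? A sketch (assuming the output directory lives on a single filesystem, where rename is atomic, and is visible inside the container):
DEST="$OUTPUT/sourcedata/freesurfer/fsaverage"
if [ ! -d "$DEST" ]; then
    mkdir -p "$(dirname "$DEST")"
    # stage into a directory only this job knows about
    TMP="$(mktemp -d "$(dirname "$DEST")/fsaverage.XXXXXX")"
    singularity exec /zi/apps/container/fmriprep_25.1.3.sif \
        cp -a /opt/freesurfer/subjects/fsaverage/. "$TMP"/
    # rename is atomic on the same filesystem; if another job got there
    # first, the rename fails and we discard our staged copy
    mv -T "$TMP" "$DEST" 2>/dev/null || rm -rf "$TMP"
fi
singularity exec /zi/apps/container/fmriprep_25.1.3.sif "${0##*/}" "$@"
That way, every job either installs a complete fsaverage in one atomic step or finds one that is already complete.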
@effigies:
Alternatively, you could use a separate output directory for each subject to avoid the race, and then merge the results.
So I did that by using the following (pseudo) code:
sbatch \
--wrap "bash -c 'mkdir -p /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ && \
module load fmriprep/25.1.3 && \
fmriprep /path/to/rawdata /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ participant \
--work-dir /path/to/workdir \
--participant_label NM01 && \
while ! rsync -av --ignore-existing /path/to/fmriprep_output_tmp/\${SLURM_JOB_ID}/ /path/to/final/output/; do sleep 5; done'"
This creates a separate output directory for each subject, runs fMRIPrep for that subject, and uses rsync to sync the subject-wise output into a final output directory.
With that approach, I did not encounter the error above, which I guess is evidence for the "racing hypothesis".
That still leaves me with my question: shouldn't fMRIPrep, in theory, be designed to be protected against these issues, since the idea is to run subjects independently and in parallel? With the current design, all jobs depend somewhat on the first job.
@JohannesWiesner If you are running fMRIPrep as a single process across subjects, it is protected. However, if multiple processes are running independently, this introduces some complexity (lock files, handling stale locks, etc.). If you have any ideas or want to contribute, happy to review a pull request 😄
I very likely underestimate the complexity here, but wouldn't it be sufficient to give each subject its own sourcedata/freesurfer/ directory?
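For what it's worth, if I read the usage docs correctly, fMRIPrep already accepts --fs-subjects-dir, which could point every job at its own FreeSurfer directory and so sidestep the shared fsaverage entirely. A sketch with hypothetical paths (each job then keeps its own copy of fsaverage, trading disk space for independence):
fmriprep /path/to/rawdata /path/to/output participant \
    --participant-label NM01 \
    --fs-subjects-dir /path/to/output/freesurfer_NM01 \
    --work-dir /path/to/workdir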
I also had this issue. I'm using Docker and was able to resolve it by mapping a complete fsaverage directory into the container.
docker run --rm \
-v "${datadir}:/data:ro" \
-v "${outdir}:/out" \
-v "${fsavgdir}:/out/sourcedata/freesurfer/fsaverage:ro" \
-v "${fs_license}:/opt/freesurfer/license.txt:ro" \
nipreps/fmriprep:25.2.3 \
/data /out participant \
--participant-label $subid \
--fs-license-file /opt/freesurfer/license.txt
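In case someone lacks a local FreeSurfer installation to take ${fsavgdir} from, one option might be to copy fsaverage out of the image once, before submitting any jobs. A sketch (--entrypoint overrides the image's default fmriprep entrypoint; the host path is hypothetical):
# one-time extraction; creates <parent of fsavgdir>/fsaverage on the host
docker run --rm --entrypoint cp \
    -v "$(dirname "${fsavgdir}"):/extract" \
    nipreps/fmriprep:25.2.3 \
    -r /opt/freesurfer/subjects/fsaverage /extract/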