Changing order of columns in confounds tsv when trying to generate identical output

Open · mharms opened this issue 4 months ago · 1 comment

What happened?

We tested whether we could generate identical (deterministic) output by combining --skull-strip-fixed-seed, --omp-nthreads 1, and a matched --random-seed value. Setting aside the date/time-of-execution information in the .gii files (and presumably in the xfm.h5 files as well), the results were indeed "identical", except for the following oddities:

  1. For some reason, the anat/*_desc-ribbon_mask.nii.gz file was the only .nii.gz file that was not binary identical (i.e., its MD5 hash differed between runs). Comparing the two files with FreeSurfer's mri_diff --no-exit-on-diff reported no differences in either the NIfTI header or the intensity values.
  2. The func/*_desc-confounds_timeseries.tsv files differed according to diff. Closer examination revealed that the order of the <var>_derivative1_power2 and <var>_power2 columns was swapped for several variables in the tsv (csf, white_matter, rot_x, and rot_z in one comparison). Within a given labelled column, the values were the same between the two executions.

Just mentioning this here for documentation purposes. Not sure what could be the source of (1). Perhaps (2) is a race condition of some sort in how the column order of the tsv files gets set?
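For anyone wanting to repeat the check in (2), here is a minimal sketch of a comparison that ignores column order, using only the Python standard library. The two TSV snippets are hypothetical toy data, not actual fMRIPrep output, and the helper names are made up for illustration.

```python
import csv
import io


def read_columns(tsv_text):
    """Parse TSV text into (header list, dict of column name -> list of string values)."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(value)
    return header, columns


def compare_ignoring_column_order(text_a, text_b):
    """Report whether two TSVs agree in column order, column names, and per-column values."""
    header_a, cols_a = read_columns(text_a)
    header_b, cols_b = read_columns(text_b)
    return {
        "same_order": header_a == header_b,
        "same_names": set(header_a) == set(header_b),
        # dict equality does not depend on key order
        "same_values": cols_a == cols_b,
    }


# Hypothetical toy data: same columns and values, two columns swapped.
run_a = "csf\tcsf_derivative1_power2\tcsf_power2\n1.0\t0.5\t1.0\n2.0\t0.25\t4.0\n"
run_b = "csf\tcsf_power2\tcsf_derivative1_power2\n1.0\t1.0\t0.5\n2.0\t4.0\t0.25\n"

result = compare_ignoring_column_order(run_a, run_b)
print(result)
```

Values are compared as strings here, so this only confirms identity when both runs write numbers with the same formatting.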

What command did you use?

/opt/conda/envs/fmriprep/bin/fmriprep /data /out participant --participant-label sub-MA15250 --bids-filter-file /work/bids_filter_ses-20230815.json --fs-license-file /fs/license.txt --fs-subjects-dir /fs_subjects --work-dir /work --nthreads 1 --omp-nthreads 1 --mem_mb 24000 --output-spaces T1w MNI152NLin6Asym fsaverage6 --skip-bids-validation --return-all-components --cifti-output 91k --project-goodvoxels --ignore slicetiming --force bbr --fs-no-resume --notrack --stop-on-first-crash --write-graph --skull-strip-fixed-seed --random-seed 40939 --verbose

What version of fMRIPrep are you running?

25.1.1

How are you running fMRIPrep?

Local installation ("bare-metal")

Is your data BIDS valid?

Yes

Are you reusing any previously computed results?

FreeSurfer

Please copy and paste any relevant log output.


Additional information / screenshots

No response

mharms · Aug 15 '25 22:08

  1. For some reason, the anat/*_desc-ribbon_mask.nii.gz file was the only .nii.gz file that was not binary identical (i.e., its MD5 hash differed between runs). Comparing the two files with FreeSurfer's mri_diff --no-exit-on-diff reported no differences in either the NIfTI header or the intensity values.

Could you share the two mask files? I can try to look.
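One check that might help narrow this down is to hash the decompressed payload rather than the .nii.gz file itself. This rests on an assumption that is not confirmed anywhere in this thread: that the difference sits in the gzip container and not in the NIfTI data. The gzip header carries a modification time (and optionally the original filename), so two archives of byte-identical data can still have different MD5 hashes. The sketch below demonstrates the effect on hypothetical in-memory bytes, not on real mask files.

```python
import gzip
import hashlib
import io


def gzip_bytes(payload, mtime):
    """Compress payload into a gzip stream with an explicit header timestamp."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


def md5_hex(data):
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for image bytes; real use would read the two .nii.gz files.
payload = b"hypothetical image bytes" * 100
file_a = gzip_bytes(payload, mtime=1)
file_b = gzip_bytes(payload, mtime=2)

# The compressed streams differ only in the header timestamp.
compressed_match = md5_hex(file_a) == md5_hex(file_b)
# The decompressed payloads are byte-identical.
payload_match = md5_hex(gzip.decompress(file_a)) == md5_hex(gzip.decompress(file_b))

print(compressed_match, payload_match)  # False True
```

If the decompressed hashes of the two mask files also differ, this explanation does not apply and the difference is in the NIfTI bytes themselves.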

  2. The func/*_desc-confounds_timeseries.tsv files differed according to diff. Closer examination revealed that the order of the <var>_derivative1_power2 and <var>_power2 columns was swapped for several variables in the tsv (csf, white_matter, rot_x, and rot_z in one comparison). Within a given labelled column, the values were the same between the two executions.

I suspect that an unordered container like set() is used somewhere. The iteration order of a Python set of strings depends on Python's hash seed (PYTHONHASHSEED), which is randomized per process by default. I can look into this.
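To illustrate the suspicion, here is a minimal sketch of the pattern and two deterministic alternatives. This is not fMRIPrep's actual code; the list of column names is a hypothetical example built from the names mentioned in the report.

```python
# Hypothetical column names, with one duplicate to motivate de-duplication.
names = [
    "csf",
    "csf_power2",
    "csf_derivative1",
    "csf_derivative1_power2",
    "csf_power2",
]

# Iterating a set of strings can give a different order in each interpreter
# run, because string hashes depend on PYTHONHASHSEED.
unordered = set(names)

# Alternative 1: de-duplicate, then sort for a stable alphabetical order.
stable_sorted = sorted(set(names))

# Alternative 2: de-duplicate while keeping first-seen order
# (dicts preserve insertion order in Python 3.7+).
stable_insertion = list(dict.fromkeys(names))

print(stable_sorted)
# ['csf', 'csf_derivative1', 'csf_derivative1_power2', 'csf_power2']
print(stable_insertion)
# ['csf', 'csf_power2', 'csf_derivative1', 'csf_derivative1_power2']
```

Setting PYTHONHASHSEED to a fixed value in the environment would also make the set order repeatable across runs, though that only hides the dependence on hash order and does not remove it.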

effigies · Sep 09 '25 12:09