Changing order of columns in confounds tsv when trying to generate identical output
What happened?
We tested whether we could generate identical (deterministic) output by combining --skull-strip-fixed-seed, --omp-nthreads 1, and a matched --random-seed value. Setting aside the date/time-of-execution information in the .gii files (and presumably the xfm.h5 files as well), the results were indeed "identical", with the following two oddities:
1. For some reason, the anat/*_desc-ribbon_mask.nii.gz file was the only .nii.gz file that was not binary identical (i.e., did not have an identical MD5 hash). Comparison of the two files using FreeSurfer's mri_diff --no-exit-on-diff reported no differences in either the NIfTI header or the intensity values.
2. The func/*_desc-confounds_timeseries.tsv files differed per diff. Closer examination revealed that the cause of this difference was that the order of the <var>_derivative1_power2 and <var>_power2 variables was swapped for several of the variables in the tsv (csf, white_matter, rot_x, and rot_z in one comparison). Within a given labelled column, the values were the same between the two executions.
Just mentioning this here for documentation purposes. Not sure what could be the source of (1). Perhaps (2) is a race condition of some sort in the manner in which the column order in the tsv files gets set?
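For anyone who wants to repeat the check in (2), here is a minimal sketch of a column-order-insensitive comparison of two confounds tables. The helper names (`read_columns`, `compare_tsv`) are hypothetical, and the two tables are made-up toy data, not values from the actual run.

```python
import csv
import io


def read_columns(tsv_text):
    """Parse tab-separated text into (header, {column_name: [values...]})."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(value)
    return header, columns


def compare_tsv(text_a, text_b):
    """Report whether two tables agree in column order, names, and values."""
    header_a, cols_a = read_columns(text_a)
    header_b, cols_b = read_columns(text_b)
    return {
        "same_order": header_a == header_b,
        "same_names": sorted(header_a) == sorted(header_b),
        # Dict equality ignores insertion order, so this compares values
        # column by column, keyed on the column label.
        "same_values": cols_a == cols_b,
    }


# Toy tables (illustrative values only) with two columns swapped.
run_a = "csf\tcsf_derivative1_power2\tcsf_power2\n1.0\t0.25\t1.0\n2.0\t1.0\t4.0\n"
run_b = "csf\tcsf_power2\tcsf_derivative1_power2\n1.0\t1.0\t0.25\n2.0\t4.0\t1.0\n"

result = compare_tsv(run_a, run_b)
print(result)
```

With real outputs, the two strings would come from reading the two *_desc-confounds_timeseries.tsv files. Values are compared as text here, which is stricter than a numeric comparison with a tolerance.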
What command did you use?
/opt/conda/envs/fmriprep/bin/fmriprep /data /out participant --participant-label sub-MA15250 --bids-filter-file /work/bids_filter_ses-20230815.json --fs-license-file /fs/license.txt --fs-subjects-dir /fs_subjects --work-dir /work --nthreads 1 --omp-nthreads 1 --mem_mb 24000 --output-spaces T1w MNI152NLin6Asym fsaverage6 --skip-bids-validation --return-all-components --cifti-output 91k --project-goodvoxels --ignore slicetiming --force bbr --fs-no-resume --notrack --stop-on-first-crash --write-graph --skull-strip-fixed-seed --random-seed 40939 --verbose
What version of fMRIPrep are you running?
25.1.1
How are you running fMRIPrep?
Local installation ("bare-metal")
Is your data BIDS valid?
Yes
Are you reusing any previously computed results?
FreeSurfer
Please copy and paste any relevant log output.
Additional information / screenshots
No response
- For some reason, the anat/*_desc-ribbon_mask.nii.gz file was the only .nii.gz file that was not binary identical (i.e., did not have an identical MD5 hash). Comparison of the two files using FreeSurfer's mri_diff --no-exit-on-diff reported no differences in either the NIfTI header or the intensity values.
Could you share the two mask files? I can try to look.
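In the meantime, one thing that might be worth checking (an untested guess on my part, not a diagnosis) is whether the difference lives in the gzip wrapper rather than in the NIfTI payload, since a gzip header carries a timestamp field. The sketch below compares MD5 hashes before and after decompression. `compare_gz_bytes` is a hypothetical helper, and the payload is a placeholder byte string rather than a real image.

```python
import gzip
import hashlib


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def compare_gz_bytes(gz_a, gz_b):
    """Compare two gzip streams as stored and after decompression."""
    return {
        "container_identical": md5_hex(gz_a) == md5_hex(gz_b),
        "payload_identical": md5_hex(gzip.decompress(gz_a))
        == md5_hex(gzip.decompress(gz_b)),
    }


# Placeholder payload compressed twice with different header timestamps.
payload = b"placeholder bytes standing in for a NIfTI image"
gz_a = gzip.compress(payload, mtime=1)
gz_b = gzip.compress(payload, mtime=2)

report = compare_gz_bytes(gz_a, gz_b)
print(report)
```

For the real files, `gz_a` and `gz_b` would be the raw bytes of the two *_desc-ribbon_mask.nii.gz outputs, read with `open(path, "rb").read()`. If the payload hashes match while the container hashes differ, that would be consistent with mri_diff reporting no differences.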
- The func/*_desc-confounds_timeseries.tsv files differed per diff. Closer examination revealed that the cause of this difference was that the order of the <var>_derivative1_power2 and <var>_power2 variables was swapped for several of the variables in the tsv (csf, white_matter, rot_x, and rot_z in one comparison). Within a given labelled column, the values were the same between the two executions.
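I suspect that an unordered container such as set() is used somewhere. The iteration order of a Python set of strings depends on the interpreter's hash seed (PYTHONHASHSEED), which is randomized per process by default. I can look into this.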
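To illustrate the suspicion (this is a standalone sketch, not fMRIPrep code, and I have not yet located where the column names are assembled), the snippet below builds the same set of column names in fresh interpreters with different PYTHONHASHSEED values and records the iteration order with and without sorted(). The column names are taken from the report above. `column_orders` is a hypothetical helper.

```python
import os
import subprocess
import sys

# Code run in each child interpreter: print the set's raw iteration order,
# then the sorted order, one per line.
SNIPPET = (
    "names = {'csf_power2', 'csf_derivative1_power2', "
    "'rot_x_power2', 'rot_x_derivative1_power2'}; "
    "print(','.join(names)); "
    "print(','.join(sorted(names)))"
)


def column_orders(hash_seed):
    """Return (raw_order, sorted_order) from a child run with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    completed = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    raw, ordered = completed.stdout.splitlines()
    return raw, ordered


results = [column_orders(seed) for seed in range(8)]
raw_orders = {raw for raw, _ in results}
sorted_orders = {ordered for _, ordered in results}

print(len(raw_orders), "distinct raw orders across seeds")
print(len(sorted_orders), "distinct sorted orders across seeds")
```

The raw order can differ from seed to seed, while the sorted order is the same for every seed. If a set does turn out to be the cause, iterating over sorted(...) or an order-preserving container (a list, or dict keys) would make the column order reproducible without having to fix the hash seed.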