Make nnUNet FIPS friendly

Open ananyaananth29 opened this issue 3 months ago • 2 comments

Description I am running nnUNetv2_train on a FIPS-enabled HPC cluster. The same jobs run successfully on another cluster without FIPS enforcement, but fail on the FIPS-enabled cluster whenever multiprocessing workers for data augmentation or validation are spawned.

The error we consistently see is:

RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message

This occurs very early, during dataloader initialization. However, setting

export nnUNet_n_proc_DA=0
export nnUNet_n_proc_val=0

allows the training to run, but disables parallel data augmentation and significantly slows down the pipeline.

Context / Debugging so far

  • Python: 3.12.4
  • nnU-Net: v2 (installed in a venv)
  • Libraries: versions match between the FIPS-enabled and non-FIPS environments.
  • Same dataset, same parameters run fine on non-FIPS, fail on FIPS-enabled environment.
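To compare the two environments beyond library versions, a small probe like the one below could be run on both clusters. It is only a sketch: it assumes a Linux host that exposes the kernel flag at /proc/sys/crypto/fips_enabled, and the helper names kernel_fips_enabled and hash_is_usable are made up for illustration, not part of nnU-Net.

```python
import hashlib
from pathlib import Path


def kernel_fips_enabled(flag_path="/proc/sys/crypto/fips_enabled"):
    """Return True/False from the Linux kernel FIPS flag, or None if it cannot be read."""
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        # Flag file missing or unreadable (non-Linux host, restricted container, ...).
        return None


def hash_is_usable(name):
    """Return True if hashlib can construct the named hash for ordinary (security) use."""
    try:
        hashlib.new(name, b"probe")
        return True
    except ValueError:
        # hashlib raises ValueError (or a subclass) for unknown or blocked digests.
        return False


if __name__ == "__main__":
    print("kernel FIPS flag:", kernel_fips_enabled())
    for name in ("md5", "sha1", "sha256"):
        print(f"{name} usable:", hash_is_usable(name))
```

If md5 shows as unusable only on the FIPS-enabled cluster, that would point at a hash call rather than at the dataset or parameters.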

There are others facing the same issue:

They attempted the fixes suggested there but they did not resolve the issue.

Request Could nnU-Net and/or its dependencies be reviewed for FIPS compliance issues? Specifically, is there a way to make the multiprocessing dataloader/augmentation workers compatible with FIPS environments?
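To help narrow this down, the sketch below runs a function in a single child process and returns the child's full traceback instead of only a "worker is no longer alive" message. It uses only the standard library and does not touch nnU-Net; probe_in_worker and md5_probe are hypothetical names, and md5_probe is just one guess at what might fail inside a worker.

```python
import multiprocessing as mp
import traceback


def _run_and_report(conn, func):
    """Run func in the child and send back its result or the formatted traceback."""
    try:
        conn.send(("ok", func()))
    except BaseException:
        conn.send(("error", traceback.format_exc()))
    finally:
        conn.close()


def probe_in_worker(func, start_method=None, timeout=30.0):
    """Run func in one child process and return (status, payload).

    status is "ok", "error", "died" (child exited without reporting)
    or "timeout". start_method=None uses the platform default.
    """
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_run_and_report, args=(child_conn, func))
    proc.start()
    child_conn.close()  # keep only the child's copy of the write end open

    status, payload = "timeout", None
    try:
        if parent_conn.poll(timeout):
            status, payload = parent_conn.recv()
    except EOFError:
        status = "died"

    proc.join(timeout=5)
    if proc.is_alive():
        proc.terminate()
        proc.join()
    if status == "died":
        payload = proc.exitcode
    return status, payload


def md5_probe():
    """Hypothetical worker body: an MD5 call, which a FIPS build may reject."""
    import hashlib
    return hashlib.md5(b"probe").hexdigest()
```

Running probe_in_worker(md5_probe) from a script on both clusters and comparing the returned status and traceback would show whether a plain MD5 call inside a worker is enough to reproduce the failure.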

Workaround Currently the only working workaround is:

export nnUNet_n_proc_DA=0
export nnUNet_n_proc_val=0

but this removes multiprocessing and significantly reduces training speed.

Impact Any FIPS-enabled HPC environment (common in federally regulated contexts) cannot use nnU-Net efficiently at present.

ananyaananth29, Sep 15 '25 16:09

Recent findings: nnU-Net (or its dependencies) must not rely on MD5 in multiprocessing, because MD5 is not a FIPS-approved hash. Switching to SHA-256 or another FIPS-approved hash should fix the issue.
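For illustration, this is roughly what such a change could look like at a call site. It is a sketch, not nnU-Net code: the function names are hypothetical, and I have not identified which library actually makes the MD5 call. The usedforsecurity keyword exists in hashlib since Python 3.9; whether it lets MD5 through on a given FIPS build depends on how Python and OpenSSL were built, so treat that part as an assumption to verify.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hex digest using SHA-256, a FIPS-approved hash."""
    return hashlib.sha256(data).hexdigest()


def legacy_md5_fingerprint(data: bytes) -> str:
    """MD5 digest declared as non-security use (e.g. cache keys, checksums).

    Only an option when the digest is not used for security purposes and
    existing MD5 values must stay stable.
    """
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


if __name__ == "__main__":
    print(fingerprint(b"example"))
```

The first variant changes the digest values, so anything keyed on old MD5 digests would need to be regenerated.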

ananyaananth29, Sep 17 '25 21:09

Could you be more specific on what the issue is with nnUNet's code for FIPS compliance?

It runs on FIPS-enabled machines. #2749 was closed because they removed the dependency that was causing the error.

vmiller987, Oct 01 '25 14:10