GaNDLF Memory requirement spike for `gandlf_preprocess`

Describe the bug

When gandlf_preprocess is run (with normalize_nonZero and crop_external_zero_planes as preprocessing parameters), the process runs fine for a validation CSV with ~180 subjects, but fails with an OOM error for a training CSV with ~800 subjects using the BraTS data.

To Reproduce

1. Construct a CSV of ~180 BraTS subjects, and another with ~720 subjects (copy-pasting the same 180 cases with different subject IDs should reproduce the error).
2. Construct a config (model and training parameters don't matter) that includes these keys for preprocessing and patch sampling:

data_preprocessing:
  {
    'normalize_nonZero',
    'crop_external_zero_planes',
  }
patch_sampler:
  {
    'type': 'label',
  }

3. Run the gandlf_preprocess script for both CSVs.
4. See it pass for the one with ~180 cases and fail for the one with ~720 cases. This is with 250 GB of RAM.
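The larger CSV from the first step could be generated with something like the sketch below. It assumes the subject ID lives in a column named `SubjectID`; the function name `replicate_subjects` and the `_copyN` suffix are made up for illustration, so adjust them to the actual CSV header.

```python
import csv


def replicate_subjects(src_csv, dst_csv, copies, id_column="SubjectID"):
    """Write `copies` copies of every row in src_csv to dst_csv.

    Each copy gets a distinct subject ID so the rows are treated as
    separate subjects. `id_column` is an assumption about the CSV header.
    """
    with open(src_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    with open(dst_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for copy_idx in range(copies):
            for row in rows:
                new_row = dict(row)
                # Only the ID changes; image and label paths are reused as-is.
                new_row[id_column] = f"{row[id_column]}_copy{copy_idx}"
                writer.writerow(new_row)

    # Number of data rows written to dst_csv.
    return len(rows) * copies
```

For example, `replicate_subjects("val_180.csv", "train_720.csv", copies=4)` (hypothetical file names) would turn a 180-subject CSV into a 720-subject one.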

Expected behavior

It should run for both.

Screenshots

N.A.

GaNDLF Version

0.0.18-dev

Desktop (please complete the following information)

N.A.

Additional context

Memory profiler (thanks @hasan7n): https://pypi.org/project/memory-profiler/
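As a rough way to check whether peak memory grows with the number of subjects, here is a minimal sketch. It uses the standard-library `tracemalloc` module rather than the memory-profiler package linked above, and it only sees allocations made through Python's allocators. `load_all` is a hypothetical stand-in for a step that keeps every subject in memory; it is not GaNDLF code.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced allocation in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def load_all(n_subjects, bytes_per_subject=1_000_000):
    # Hypothetical stand-in: holds one buffer per subject at the same time,
    # so peak memory scales linearly with the number of subjects.
    return [bytearray(bytes_per_subject) for _ in range(n_subjects)]


if __name__ == "__main__":
    for n in (180, 720):
        _, peak = peak_memory_mib(load_all, n)
        print(f"{n} subjects: peak ~{peak:.0f} MiB")
```

If the real preprocessing call is wrapped the same way, a peak that scales with the subject count would suggest that per-subject data is being retained instead of released after each subject is processed.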

Jan 22 '24 20:01 sarthakpati

Is the corresponding BraTS data publicly available? Can you provide it also, please?

Feb 26 '24 07:02 VukW

You should be able to download the data here: https://www.synapse.org/brats

And this should be replicable even on the unit testing data [ref].

Do you think you can include the report from #806 in your fix as well (since both are related to memory consumption)?

Feb 26 '24 13:02 sarthakpati

Stale issue message

Apr 27 '24 19:04 github-actions[bot]

This is still under investigation.

May 05 '24 06:05 sarthakpati

Memory requirement spike for `gandlf_preprocess`