Memory requirement spike for `gandlf_preprocess`
**Describe the bug**
When `gandlf_preprocess` is run with `normalize_nonZero` and `crop_external_zero_planes` as preprocessing parameters, the process runs fine for a validation CSV with ~180 subjects, but fails with an OOM error for a training CSV with ~800 subjects using the BraTS data.
**To Reproduce**
- Construct a CSV of ~180 BraTS subjects, and another with ~720 subjects (copy-pasting the same 180 cases with different subject IDs should reproduce the error; see the sketch after these steps).
- Construct a config (model and training parameters don't matter), but include these keys for preprocessing and patch sampling:

```yaml
data_preprocessing:
  {
    'normalize_nonZero',
    'crop_external_zero_planes',
  }
patch_sampler:
  {
    'type': 'label',
  }
```
- Run the `gandlf_preprocess` script for both CSVs.
- See it pass for the ~180-case CSV and fail for the ~720-case one, on a machine with 250 GB of RAM.
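For the first step, here is a minimal sketch of inflating the smaller CSV by duplicating rows under fresh subject IDs; the `SubjectID` column name is an assumption, so adjust it to match the headers of the actual data CSV:

```python
# Inflate a ~180-subject CSV to ~720 subjects by duplicating each
# row 4 times with a fresh subject ID (4 x 180 = 720).
# "SubjectID" is an assumed column name; match your CSV headers.
import csv

with open("validation_180.csv", newline="") as f:
    rows = list(csv.DictReader(f))

inflated = []
for copy_idx in range(4):
    for row in rows:
        new_row = dict(row)
        new_row["SubjectID"] = f'{row["SubjectID"]}_copy{copy_idx}'
        inflated.append(new_row)

with open("training_720.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=inflated[0].keys())
    writer.writeheader()
    writer.writerows(inflated)
```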
**Expected behavior**
It should run for both.

**Screenshots**
N.A.

**GaNDLF Version**
0.0.18-dev

**Desktop (please complete the following information):**
N.A.
**Additional context**
Memory profiler (thanks @hasan7n): https://pypi.org/project/memory-profiler/
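As a minimal sketch of how the profiler could be pointed at the preprocessing run to confirm the spike (`run_preprocessing` below is a hypothetical placeholder, not the actual GaNDLF entry point):

```python
# Minimal sketch of measuring peak memory with memory-profiler
# (pip install memory-profiler). `run_preprocessing` is a hypothetical
# stand-in for the real GaNDLF preprocessing routine.
from memory_profiler import memory_usage

def run_preprocessing(csv_path):
    # placeholder: call the actual preprocessing code here
    pass

# Sample RSS once per second while the function runs, then report the peak.
trace = memory_usage((run_preprocessing, ("training_720.csv",)), interval=1.0)
print(f"peak memory: {max(trace):.1f} MiB")
```

Alternatively, `mprof run <script>` from the same package records a memory-usage time series without modifying the code.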
Is the corresponding BraTS data publicly available? Could you also provide it, please?
You should be able to download the data here: https://www.synapse.org/brats
And this should be replicable even on the unit testing data [ref].
Do you think you can include the report from #806 in your fix as well (since both relate to memory consumption)?
Stale issue message
This is still under investigation.