Implement Argmax Inference Instead of Softmax when 'has_regions' is False
Issue: I have run into a memory consumption issue in my Docker setup during inference. With shm-size set to 30 GB, the softmax over an input tensor of size [24, 724, 435, 435] completes successfully. Reducing shm-size to 15 GB, however, causes the Docker container to be killed during the softmax computation due to insufficient shared memory.
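A rough back-of-the-envelope estimate illustrates why ~15 GB of shared memory is too little for a tensor of this shape (assuming float32 logits and that the softmax allocates a second full-size output buffer; both are assumptions for illustration, not measurements from this setup):

```python
import math

# Shape of the logits tensor from the issue description: [classes, z, y, x]
shape = (24, 724, 435, 435)
n_elements = math.prod(shape)          # ~3.29e9 elements

logits_gib = n_elements * 4 / 1024**3  # ~12.2 GiB at float32 (4 bytes per element)
# softmax is not computed in place, so a second full-size buffer is needed,
# putting the peak at roughly 2x the logits size: above 15 GB but below 30 GB.
peak_gib = 2 * logits_gib
print(f"logits: {logits_gib:.1f} GiB, softmax peak: ~{peak_gib:.1f} GiB")
```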
Solution: Use argmax instead of softmax when the 'has_regions' flag is set to False. This change is meant to stay within the shared memory (shm-size) constraints of Docker environments. After the change, the Docker container handles the computation without issue, even with shm-size set to 0.
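To make the proposed change concrete, here is a minimal PyTorch sketch (not the actual nnU-Net code; `export_segmentation` and its arguments are placeholder names used only for this example):

```python
import torch

def export_segmentation(predicted_logits: torch.Tensor, has_regions: bool) -> torch.Tensor:
    """Hypothetical helper: 'predicted_logits' has shape [num_classes, z, y, x]."""
    if not has_regions:
        # argmax over the class axis yields the final label map directly, so the
        # full [num_classes, z, y, x] probability tensor is never materialized.
        return predicted_logits.argmax(dim=0)
    # Region-based setups still need actual probabilities (handled by nnU-Net's
    # label manager); that path is intentionally left out of this sketch.
    raise NotImplementedError("region-based export not shown in this sketch")
```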
Thanks for this PR. I am not a big fan of this, to be honest, because we take something that is well structured and abstracted (the label manager is in charge) and make it more complicated by moving functionality around. The original idea was that the label manager is the central authority deciding how predictions should be handled, so that we only need to maintain and change that behavior in one location. If we now start re-adding label handling into different functions in different locations, it all becomes a complicated mess again. So I'd much rather understand the underlying reason for this problem better and address it systematically. Why does your implementation not use as much SHM as mine? Does the overall amount of RAM needed to do the export change? Why not just give the Docker container more SHM?
Hi,
Thank you for your feedback. I ran into this issue while preparing for the AortaSeg24 Grand Challenge. According to the Grand Challenge platform's requirements for algorithms (Grand Challenge Documentation), 50% of the system memory is allocated as shared memory: on a 16 GB instance, /dev/shm will be 8 GB, and this percentage is not modifiable. The provided instance type is ml.g4dn.2xlarge with 1x NVIDIA T4 GPU, 8 vCPUs, 32 GB of memory, and 1x 225 GB NVMe SSD. Participants can therefore use at most 32 GB of memory and an shm-size of 16 GB.
I observed the same behavior when I set up a fresh Docker environment: with both the memory limit and the shm-size set to 35 GB, the softmax over an input tensor of size [24, 724, 435, 435] completes successfully, but with both set to 30 GB the Docker container is unexpectedly terminated. This suggests that inference on machines with 30 GB of memory or less will run into the same problem.
Regarding your point about label management, I agree that handling labels in one location is cleaner and avoids redundancy. However, I am currently unsure how to add an inference-specific option directly within the label manager, so I would appreciate suggestions on how to address this issue systematically.
Hey, so I would like to close this PR because it breaks with our existing design patterns to work around a problem caused by a single challenge platform. I suggest that users affected by such constraints write their own inference code on top of nnU-Net's functionality. You can take inspiration from our ToothFairy2 submission, which had to deal with far more of these problems because that dataset has 43 labels: https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/competitions/Toothfairy2/inference_script_semseg_only_customInf2.py
Our script is a bit more involved than what you would need, but it has some nice optimizations for runtime and memory efficiency.
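For readers who end up writing such a script, the core memory trick looks roughly like the sketch below. This is an illustration only, not the linked ToothFairy2 code; `chunked_argmax` and the slab size are made up for this example:

```python
import torch

def chunked_argmax(logits: torch.Tensor, chunk: int = 64) -> torch.Tensor:
    """Reduce a [num_classes, z, y, x] logits tensor to a [z, y, x] label map
    slab by slab along z, so only a small slice of the volume is expanded at a time.

    Illustrative only; the real script linked above goes further (e.g. half
    precision, resampling tricks), but the idea is the same: never hold a full
    probability volume in memory.
    """
    num_classes, z, y, x = logits.shape
    # uint8 is sufficient while num_classes <= 255
    segmentation = torch.empty((z, y, x), dtype=torch.uint8)
    for start in range(0, z, chunk):
        end = min(start + chunk, z)
        # argmax over the class axis for this slab only
        segmentation[start:end] = logits[:, start:end].argmax(dim=0).to(torch.uint8)
    return segmentation
```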
Hi, thanks for providing the detailed ToothFairy2 inference code. It has been really helpful for preparing my submission to the challenge platform. I will close this pull request.