Improve HoVerNet postprocessing performance
HoVerNet post-processing is an essential part of the HoVerNet pipeline, but it takes too much time (far more than inference itself). @JHancox has experience making it run much faster, but outside of MONAI. Although I am working with the cuCIM team to get the necessary scikit-image functions accelerated so that the post-processing can run on GPU, it would be great if we could take advantage of some acceleration on CPU while the GPU is busy running inference.
CC @Nic-Ma @KumoLiu
Happy to run through some of what I did. I used 3 main techniques: 1) using a pool of threads to process batches of tiles concurrently, 2) yielding dynamically thresholded tiles to the DataLoader, and 3) caching strips of image data from the WSI for more efficient IO during item (2).
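As a rough illustration of technique (1) only, the sketch below processes batches of tiles concurrently with a thread pool. It is not the code referred to above: `postprocess_tile`, `postprocess_batch`, `run_in_threads`, the batch size, and the fixed threshold are all hypothetical placeholders standing in for the real HoVerNet post-processing, and the tiles are random arrays rather than WSI regions.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def postprocess_tile(tile: np.ndarray) -> int:
    """Hypothetical stand-in for per-tile post-processing.

    Counts pixels above a fixed threshold; a real pipeline would run
    its instance-segmentation steps here instead.
    """
    return int((tile > 0.5).sum())


def postprocess_batch(batch: list) -> list:
    """Process one batch of tiles sequentially inside a worker thread."""
    return [postprocess_tile(tile) for tile in batch]


def run_in_threads(tiles: list, batch_size: int = 4, workers: int = 4) -> list:
    """Split tiles into batches and process the batches concurrently."""
    batches = [tiles[i:i + batch_size] for i in range(0, len(tiles), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with the tiles.
        per_batch = list(pool.map(postprocess_batch, batches))
    # Flatten the per-batch results back into one list, one entry per tile.
    return [count for batch in per_batch for count in batch]


# Assumed toy input: ten random 256x256 tiles.
rng = np.random.default_rng(0)
tiles = [rng.random((256, 256)) for _ in range(10)]
counts = run_in_threads(tiles)
```

Threads only pay off here when the per-tile work spends most of its time in code that releases the GIL, so the benefit for a given post-processing step would need to be measured.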
Hi @drbeh, as this became a limiting step for me as well, I wanted to ask: why not use dask-image functions and dask.map_blocks here to multi-thread on CPUs at least?
Hi @omashkartrx,
Have you tried it with dask to see if dask multi-threading can help here? Since it does not circumvent the Python GIL and can only provide parallelism for non-Python code (including NumPy operations), I don't know how much it can help with scikit-image operations. https://docs.dask.org/en/stable/scheduling.html#local-threads
@Nic-Ma, we are not using dask anywhere in MONAI, right?
I haven't used dask before.
Thanks.
Hi @drbeh and @Nic-Ma. I don't know exactly how it works, but I have used dask with watershed before. I loaded a WSI and it was done in a few seconds. Without dask, it was not even loading.
For example: https://examples.dask.org/applications/image-processing.html https://www.kaggle.com/code/kmader/3d-image-analysis-using-dask/notebook
I am not sure what other post-processing steps are needed besides watershed, but I am sure it works with scikit-image watershed at least.
Hi @omashkartrx, thanks for sharing your experience. dask should indeed help with loading WSIs, as that is an IO-bound operation, and we'd appreciate your help here if you feel you can contribute. However, adding dask to MONAI's dependencies should be discussed first based on the value it can bring, and exploring options in RAPIDS might also be helpful.
Thanks @drbeh. I will definitely try to help, but I am not sure where to start. I will try to implement it and open a pull request to get your opinion. Thanks
@drbeh, @Nic-Ma, @omashkartrx - I have used dask before for exactly this sort of thing and have some notebooks to show this. My GTC workshop session this year (and last year) also shows these approaches in action, but the same can be achieved using threads or processes directly, which reduces the dependencies. Processes have the advantage of not being affected by the GIL, but in practice, I don't see a huge difference between the two.
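The point about threads and processes being interchangeable can be sketched with the standard library's `concurrent.futures`, where both pool types expose the same `map` interface. This is a minimal sketch under stated assumptions, not the notebook or workshop code mentioned above: `process_tile` and `run_pool` are hypothetical names, and the tile work is a placeholder.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def process_tile(tile: list) -> int:
    """Hypothetical per-tile work: count values above a threshold.

    Defined at module level so that a process pool can pickle it.
    """
    return sum(1 for value in tile if value > 128)


def run_pool(tiles: list, pool_cls: type = ThreadPoolExecutor, workers: int = 4) -> list:
    """Run process_tile over all tiles using the given executor class."""
    with pool_cls(max_workers=workers) as pool:
        # Both executor types return results in input order from map().
        return list(pool.map(process_tile, tiles))


# Assumed toy input: eight "tiles" of 1000 integer values each.
tiles = [[(i * j) % 256 for j in range(1000)] for i in range(8)]
thread_results = run_pool(tiles, ThreadPoolExecutor)

# Swapping in processes is a one-argument change, for example:
#   run_pool(tiles, ProcessPoolExecutor)
# On platforms that spawn workers, call it under `if __name__ == "__main__":`.
```

Keeping the executor class as a parameter makes it easy to time both variants on the same workload without adding a dependency.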
Hi @Nic-Ma @KumoLiu, the post-processing performance of HoVerNet is a bottleneck for using it efficiently, as reported by users and seen in our own usage in MONAI Label. Could you take a look to find room for improvement?
One approach to try is to replace the scikit-image operators with OpenCV, which is what @JHancox has already used. It should give us some speedup, since OpenCV is generally much faster than scikit-image.
Hi @drbeh ,
One drawback is that OpenCV is a very large package and may cause dependency errors. @wyli I don't remember the reason clearly, but did you tell me to avoid using OpenCV before?
Thanks.
That was about the ffmpeg part of OpenCV for video processing, which comes with some GPL license.