Faster implementation available for `vectorize()`
During the Ghent Hackathon (BioHackrXiv here), @hey2homie developed a faster implementation of `vectorize()`: https://github.com/saeyslab/VIB_Hackathon_June_2024/blob/main/polygons/polygons_test.ipynb.
The implementation also fixes https://github.com/scverse/spatialdata/issues/583, but relies on opencv, which is a heavy dependency.
Still, we should see whether that code could help improve the performance.
@hey2homie I won't have the time to check your approach soon, but if you think that the same approach could be used without using opencv and are willing to open a PR, your contribution would be very welcome 😊
@LucaMarconato, thanks for reaching out! I've been a bit too busy lately to follow up on this. There were some minor issues in the code from the hackathon, but I've already fixed them. I will work a bit more on this to polish it and submit a PR.
By the way, the speed boost is most likely coming from using plain numpy instead of chunking with dask arrays, since opencv and skimage have almost identical performance. So, abandoning opencv in favor of skimage wouldn't be a problem!
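
To make the idea concrete, here is a minimal sketch of tracing label outlines on a single in-memory numpy array with skimage only. It is not the hackathon code: `label_to_rings` is a hypothetical helper, and it assumes that label `0` is background and that the whole label image fits in memory.

```python
import numpy as np
from skimage.measure import find_contours


def label_to_rings(labels: np.ndarray) -> dict[int, list[np.ndarray]]:
    """Trace the outline of every non-zero label in an in-memory 2D array.

    Hypothetical helper for illustration; not part of spatialdata.
    """
    labels = np.asarray(labels)  # work on one plain array, no per-chunk processing
    rings = {}
    for label in np.unique(labels):
        if label == 0:  # assumption: 0 marks the background
            continue
        # Pad with background so outlines touching the image border are closed.
        mask = np.pad(labels == label, 1).astype(float)
        contours = find_contours(mask, 0.5)
        # Undo the padding offset and flip (row, col) into (x, y).
        rings[int(label)] = [c[:, ::-1] - 1 for c in contours]
    return rings


labels = np.zeros((6, 6), dtype=int)
labels[1:3, 1:4] = 1  # rectangle in the interior
labels[4:6, 3:6] = 2  # rectangle touching the image border
rings = label_to_rings(labels)
```

Each value in `rings` is a list of closed `(x, y)` coordinate arrays, which could then be turned into polygons (for example with shapely). Holes and multi-part labels would need extra handling that this sketch leaves out.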
Thank you for the explanation, and thank you in advance for the PR; it is very appreciated!
Is this related to #560?
Yes, #560 was the original implementation, but a faster one was developed by @hey2homie during the Ghent hackathon.