Support for scipy.sparse.COO
Hey,
I have lots of very big 3D EHR data that is also very sparse. The upcoming scipy release will feature n-D sparse COO arrays which are useful to store and retrieve big time series data. I was wondering whether you'd be open to single dispatching your implementations to scipy.sparse.COO for potentially tons of memory savings and speedups?
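To give a rough sense of the scale I have in mind, here is a toy 2-D example with the existing `coo_array` (shapes and sparsity level are made up; the n-D variant from the upcoming release would apply the same idea to the full 3-D tensors):

```python
import numpy as np
from scipy.sparse import coo_array

rng = np.random.default_rng(0)

# Illustrative only: one 2-D slice (timesteps x features) of the kind of
# EHR data I have, with roughly 1% of the entries actually observed.
dense = rng.random((100_000, 200))
dense[dense > 0.01] = 0.0

sparse = coo_array(dense)

print(f"dense container: {dense.nbytes / 1e6:7.1f} MB")  # 160 MB of float64
print(f"non-zero values: {sparse.data.nbytes / 1e6:7.1f} MB "
      f"({sparse.nnz} stored entries, plus their index arrays)")
```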
Are you referring to https://scipy.github.io/devdocs/reference/generated/scipy.sparse.coo_array.html#coo-array ? Most of the metrics do not support sparse inputs though. Can you elaborate on your intended usage: where in tslearn do you expect to use these inputs?
@charavelg I think you've edited your message, which now has a completely different meaning. I don't get a notification if you don't post again 😄.
Yes, I am referring exactly to that (specifically the version in the upcoming release, which supports n-D arrays).
> Most of the metrics do not support sparse inputs though.
I think that in many cases there are ways to reformulate the problem or algorithm to also support sparse arrays. For instance, if your dense implementation supports zeros as values, it technically already supports sparse data, just not efficiently.
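To make that concrete, here is what the naive fallback looks like today (a sketch, assuming tslearn's dense `dtw`; the series values are made up):

```python
import numpy as np
from scipy.sparse import coo_array
from tslearn.metrics import dtw  # existing dense implementation

# Two short univariate series stored as (sz, 1) COO arrays.
s1 = coo_array(np.array([[0.0, 0.0, 1.0, 2.0, 0.0]]).T)
s2 = coo_array(np.array([[0.0, 1.0, 2.0, 0.0, 0.0]]).T)

# This "works" because the dense code is happy with explicit zeros,
# but .toarray() rebuilds the full dense arrays first, so the whole
# point of keeping the data in a sparse container is lost.
print(dtw(s1.toarray(), s2.toarray()))
```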
> Can you elaborate on your intended usage: where in tslearn do you expect to use these inputs?
Yes, the metrics would be a good start. For example, I'd be super happy if, say, dtw supported sparse arrays. If you'd be open to tackling this issue, I can put together a POC to showcase what I mean.
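Something along these lines is what I have in mind for the POC: a thin `functools.singledispatch` wrapper around the existing metric, with a dedicated branch for COO inputs. Just a sketch on my side, not tslearn API: the sparse branch below only densifies as a placeholder, and the `_dense_dtw` alias is mine.

```python
from functools import singledispatch

import numpy as np
from scipy.sparse import coo_array
from tslearn.metrics import dtw as _dense_dtw  # existing implementation


@singledispatch
def dtw(s1, s2):
    # Default path: plain ndarrays go straight to the current dense code
    # (dispatch is on the type of the first argument only).
    return _dense_dtw(s1, s2)


@dtw.register(coo_array)
def _(s1, s2):
    # Placeholder for the sparse-aware path: for now it just densifies,
    # but this is where a COO-specific kernel could walk the stored
    # coordinates and values instead of materialising every zero.
    if isinstance(s2, coo_array):
        s2 = s2.toarray()
    return _dense_dtw(s1.toarray(), s2)


# Callers pass whichever container they have; dispatch picks the branch.
s1 = coo_array(np.array([[0.0, 0.0, 1.0, 2.0]]).T)
s2 = np.array([[0.0, 1.0, 2.0, 0.0]]).T
print(dtw(s1, s2))
```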