spatialdata-plot icon indicating copy to clipboard operation
spatialdata-plot copied to clipboard

datashader speedup and bugfixes

Open Sonja-Stockhaus opened this issue 1 year ago • 15 comments

This speeds everything up for me locally, but for the benchmark (#296), I see an effect (aka datashader faster than matplotlib) for e.g. 10k points/shapes etc.

Sonja-Stockhaus avatar Jul 17 '24 10:07 Sonja-Stockhaus

Codecov Report

Attention: Patch coverage is 83.43195% with 28 lines in your changes missing coverage. Please review.

Project coverage is 83.36%. Comparing base (80c6f77) to head (73568ec).

Files with missing lines Patch % Lines
src/spatialdata_plot/pl/render.py 76.59% 22 Missing :warning:
src/spatialdata_plot/pl/utils.py 91.78% 6 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #309      +/-   ##
==========================================
- Coverage   83.73%   83.36%   -0.37%     
==========================================
  Files           8        8              
  Lines        1543     1653     +110     
==========================================
+ Hits         1292     1378      +86     
- Misses        251      275      +24     
Files with missing lines Coverage Δ
src/spatialdata_plot/pl/basic.py 89.26% <ø> (ø)
src/spatialdata_plot/pl/render_params.py 100.00% <100.00%> (ø)
src/spatialdata_plot/pl/utils.py 77.48% <91.78%> (+1.11%) :arrow_up:
src/spatialdata_plot/pl/render.py 90.25% <76.59%> (-4.47%) :arrow_down:

codecov-commenter avatar Jul 17 '24 10:07 codecov-commenter

Thanks @Sonja-Stockhaus ! I have tried the PR on the Visium HD data and the speed up is significant! I switched back to using datashader instead of matplotlib in the notebook 😊

LucaMarconato avatar Aug 01 '24 12:08 LucaMarconato

Sonja, I found a bug with this PR when the rasterized object needs to be aligned with the rest https://github.com/scverse/spatialdata-plot/issues/291. The solution seems straightforward, I think one just needs to initialize the image element from datashader using the old coordinate transformations.

LucaMarconato avatar Aug 09 '24 13:08 LucaMarconato

Will look into it!

timtreis avatar Aug 14 '24 23:08 timtreis

Thanks! Ready for review correct?

LucaMarconato avatar Aug 15 '24 09:08 LucaMarconato

Will try to find some time today to deal with

Sonja, I found a bug with this PR when the rasterized object needs to be aligned with the rest https://github.com/scverse/spatialdata-plot/issues/291. The solution seems straightforward, I think one just needs to initialize the image element from datashader using the old coordinate transformations.

and will ping you afterwards 👍 Just marked it as "ready" to see if the tests pass.

Which dataset did you use? VisHD?

timtreis avatar Aug 15 '24 13:08 timtreis

Thanks 😊 You can use the code from this branch: https://github.com/scverse/spatialdata-plot/pull/295. But Visium HD should also be a good candidate I believe.

LucaMarconato avatar Aug 15 '24 14:08 LucaMarconato

@timtreis ready for review

Sonja-Stockhaus avatar Sep 12 '24 13:09 Sonja-Stockhaus

Hi @timtreis, checking the status of this PR. There are still some open tasks on this PR right? Can you list them here as tickable tasks please?

LucaMarconato avatar Sep 27 '24 08:09 LucaMarconato

Todo:

  • [ ] behavior of alpha when rendering shapes should be more similar to mpl
  • [ ] outline of polygons need to be implemented in datashader
  • [ ] look into aggregation methods that aren't working rn (mode, any, std, var)
  • [ ] fix failing tests

Sonja-Stockhaus avatar Sep 27 '24 12:09 Sonja-Stockhaus

Thanks for the update!

outline of polygons need to be implemented in datashader

This task here seems tricky. I would consider raising a warning saying that the outline is currently not supported and workin on this on a separate PR (with low priority).

LucaMarconato avatar Sep 27 '24 12:09 LucaMarconato

outline of polygons need to be implemented in datashader

This task here seems tricky. I would consider raising a warning saying that the outline is currently not supported and workin on this on a separate PR (with low priority).

If you use cvs.line instead of cvs.polygons you get only the outline, so you should be able to overlay the two. Depending on how fast we want to merge this PR, we can of course move that to a new one

Sonja-Stockhaus avatar Sep 27 '24 12:09 Sonja-Stockhaus

Ah ok, then it should be a fast addition, thanks for the update.

LucaMarconato avatar Sep 27 '24 12:09 LucaMarconato

General problem of datashader so far: when you render elements, coloring them by a value and a color map, you wouldn't see a single element if all of them have the same value X. This is due to the scaling of the data for the colormap that happens in datashader (scaled_data = (data - span[0])/(span[1] - span[0])) which leads to all elements being assigned alpha=0.

Fix in this PR: internally, not the whole colormap is passed to datashader, but just the color given by cmap(0.0). It is used to color all elements. Still, a colorbar of the cmap is drawn with the range [X, X+1].

Sonja-Stockhaus avatar Oct 02 '24 16:10 Sonja-Stockhaus

Thanks for the explanation, I think the approach that you implemented is a good one!

LucaMarconato avatar Oct 02 '24 16:10 LucaMarconato

As discussed with @timtreis, we now use sum as default datashader aggregation for points and mean for shapes

Sonja-Stockhaus avatar Oct 16 '24 17:10 Sonja-Stockhaus