spatialdata-plot
spatialdata-plot copied to clipboard
datashader speedup and bugfixes
This speeds everything up for me locally, but for the benchmark (#296), I see an effect (aka datashader faster than matplotlib) for e.g. 10k points/shapes etc.
Codecov Report
Attention: Patch coverage is 83.43195% with 28 lines in your changes missing coverage. Please review.
Project coverage is 83.36%. Comparing base (
80c6f77) to head (73568ec).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/spatialdata_plot/pl/render.py | 76.59% | 22 Missing :warning: |
| src/spatialdata_plot/pl/utils.py | 91.78% | 6 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #309 +/- ##
==========================================
- Coverage 83.73% 83.36% -0.37%
==========================================
Files 8 8
Lines 1543 1653 +110
==========================================
+ Hits 1292 1378 +86
- Misses 251 275 +24
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/spatialdata_plot/pl/basic.py | 89.26% <ø> (ø) |
|
| src/spatialdata_plot/pl/render_params.py | 100.00% <100.00%> (ø) |
|
| src/spatialdata_plot/pl/utils.py | 77.48% <91.78%> (+1.11%) |
:arrow_up: |
| src/spatialdata_plot/pl/render.py | 90.25% <76.59%> (-4.47%) |
:arrow_down: |
Thanks @Sonja-Stockhaus ! I have tried the PR on the Visium HD data and the speed up is significant! I switched back to using datashader instead of matplotlib in the notebook 😊
Sonja, I found a bug with this PR when the rasterized object needs to be aligned with the rest https://github.com/scverse/spatialdata-plot/issues/291. The solution seems straightforward, I think one just needs to initialize the image element from datashader using the old coordinate transformations.
Will look into it!
Thanks! Ready for review correct?
Will try to find some time today to deal with
Sonja, I found a bug with this PR when the rasterized object needs to be aligned with the rest https://github.com/scverse/spatialdata-plot/issues/291. The solution seems straightforward, I think one just needs to initialize the image element from datashader using the old coordinate transformations.
and will ping you afterwards 👍 Just marked it as "ready" to see if the tests pass.
Which dataset did you use? VisHD?
Thanks 😊 You can use the code from this branch: https://github.com/scverse/spatialdata-plot/pull/295. But Visium HD should also be a good candidate I believe.
@timtreis ready for review
Hi @timtreis, checking the status of this PR. There are still some open tasks on this PR right? Can you list them here as tickable tasks please?
Todo:
- [ ] behavior of alpha when rendering shapes should be more similar to mpl
- [ ] outline of polygons need to be implemented in datashader
- [ ] look into aggregation methods that aren't working rn (mode, any, std, var)
- [ ] fix failing tests
Thanks for the update!
outline of polygons need to be implemented in datashader
This task here seems tricky. I would consider raising a warning saying that the outline is currently not supported and workin on this on a separate PR (with low priority).
outline of polygons need to be implemented in datashader
This task here seems tricky. I would consider raising a warning saying that the outline is currently not supported and workin on this on a separate PR (with low priority).
If you use cvs.line instead of cvs.polygons you get only the outline, so you should be able to overlay the two. Depending on how fast we want to merge this PR, we can of course move that to a new one
Ah ok, then it should be a fast addition, thanks for the update.
General problem of datashader so far: when you render elements, coloring them by a value and a color map, you wouldn't see a single element if all of them have the same value X. This is due to the scaling of the data for the colormap that happens in datashader (scaled_data = (data - span[0])/(span[1] - span[0])) which leads to all elements being assigned alpha=0.
Fix in this PR: internally, not the whole colormap is passed to datashader, but just the color given by cmap(0.0). It is used to color all elements. Still, a colorbar of the cmap is drawn with the range [X, X+1].
Thanks for the explanation, I think the approach that you implemented is a good one!
As discussed with @timtreis, we now use sum as default datashader aggregation for points and mean for shapes