spatialdata-plot
figures sharpness
Hi,
The figures plotted by spatialdata are not as clear as those from squidpy (perhaps because the scatter point size is too small, but I couldn't find where to adjust the point size), and spatialdata_plot is a lot slower than squidpy when plotting. How can I improve this?
Here are the figures from the spatialdata tutorial: https://spatialdata.scverse.org/en/latest/tutorials/notebooks/notebooks/examples/squidpy_integration.html
Both the performance and the visibility of the signal from the plot will be addressed when the bug described in the linked comment is closed, which will enable using datashader as a plotting backend.
To be precise, the first plot will still be faster (it takes less than one second), but the second plot will now take only ~8s, as opposed to ~1m. The plot will also be more interpretable, as datashader addresses many plotting pitfalls. For instance, the first plot appears to be oversaturated.
Here is what the second plot looks like now:
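To make the oversaturation point concrete, here is a rough sketch of the general idea of aggregating points per pixel before colouring them. It is not datashader's code, and `bin_points` is a hypothetical helper written only for illustration; it uses the standard library alone.

```python
from collections import Counter


def bin_points(points, x_range, y_range, width, height):
    """Count how many points fall into each pixel of a width x height canvas."""
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # skip points outside the canvas
        # Map the coordinate to a pixel index; clamp the upper edge into the last pixel.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[(row, col)] += 1
    return counts


# Three of the five points land in the same pixel of a 4 x 4 canvas.
points = [(0.1, 0.1), (0.12, 0.11), (0.9, 0.9), (0.5, 0.5), (0.11, 0.13)]
counts = bin_points(points, (0.0, 1.0), (0.0, 1.0), width=4, height=4)
print(counts[(0, 0)])  # number of points sharing the first pixel
```

With one marker drawn per point, the three overlapping points look like a single marker. With per-pixel counts, the colour can encode how many points are actually there.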
Anyway, if you don't want to use datashader and want to manually change the point size, please consider the following:
- The size of the cells was indeed too small, as explained here https://github.com/scverse/spatialdata-io/blob/258e080fe2f992e489e11b0a5e64b0a36db81d89/src/spatialdata_io/readers/xenium.py#L99, because it approximated the nuclei instead of the cells. This has now changed to the radius of the cells, so freshly loaded datasets should appear more saturated with the matplotlib-based spatialdata plotting method as well. Please note that when you zoom in and want accurate plotting, plotting from the polygons is preferable to plotting from circles/points.
- You can adjust the point size by continuing to plot with squidpy, or by calling get_centroids() to obtain a spatialdata points element. This can be plotted using pl.render_points(), which has a size parameter.
Please note that if you want to plot from polygons or points, you need to change the table annotation target, as explained here: https://github.com/scverse/spatialdata-io/blob/fd3caaf37f35941a16a22cd4d51012416c4f8c3a/src/spatialdata_io/readers/xenium.py#L138.
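A minimal sketch of the centroid route could look like the following. It assumes a SpatialData object that contains a shapes element named "cell_circles"; the name "cell_centroids", the default size of 5.0, and the helper `plot_centroids` itself are hypothetical choices for illustration. The points are not coloured here, since colouring by a table column requires the table to annotate the new element.

```python
def plot_centroids(sdata, shapes_name="cell_circles", points_name="cell_centroids", size=5.0):
    """Derive a points element from a shapes element and render it with an explicit size."""
    # Imported lazily so the function can be defined without the libraries installed.
    import spatialdata_plot  # noqa: F401  (registers the .pl accessor on SpatialData)
    from spatialdata import get_centroids

    # get_centroids returns a points element with one point per shape.
    sdata[points_name] = get_centroids(sdata[shapes_name])
    # size sets the marker size of the rendered points; 5.0 is an arbitrary starting value.
    sdata.pl.render_points(points_name, size=size).pl.show()
    return sdata
```

Calling `plot_centroids(sdata, size=10.0)` on a loaded dataset would then redraw the same centroids with larger markers.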
@LucaMarconato
I have tried the latest spatialdata_plot; the default method is now datashader. Here is the figure generated by datashader:
And this is generated by matplotlib:
And this is from squidpy:
They are expected to be the same except for the running time, but as you can see, the two figures (datashader and matplotlib) differ a lot! The results of matplotlib and squidpy look more similar. Which one is right? What is stranger is that the running time of datashader is 180s, compared with 105s for matplotlib. It seems that datashader is even slower than matplotlib. There are 250,000 cells in my dataset.
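For comparing the two backends in a reproducible way, a small timing wrapper such as the one below may help. `time_call` is a hypothetical helper, not part of spatialdata-plot; the commented usage assumes an existing `sdata` object and that `render_shapes` accepts the `method` argument discussed in this thread.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the snippet runs on its own.
result, elapsed = time_call(sum, range(1_000_000))
print(f"sum took {elapsed:.3f}s")

# With a SpatialData object `sdata` (not created here), the same wrapper could be used as:
# for method in ("matplotlib", "datashader"):
#     _, seconds = time_call(
#         lambda: sdata.pl.render_shapes("cell_circles", color="n_counts", method=method).pl.show()
#     )
#     print(method, f"{seconds:.1f}s")
```

Note that with an interactive matplotlib backend the measured time also includes however long the figure window stays open, so a non-interactive backend gives more comparable numbers.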
Then I changed the region from "cell_circles" to "cell_boundaries", and the figure looks clear. Is it better to plot "cell_boundaries" instead of "cell_circles" ?
sdata["table"].obs["region"] = "cell_boundaries"
sdata.set_table_annotates_spatialelement("table", region="cell_boundaries")
sdata.pl.render_shapes("cell_boundaries", color="n_counts").pl.show()
Another little issue: Now there are no x_axis and y_axis labels on the figures.
Thanks for the details.
- The performance problem of datashader is being addressed here: https://github.com/scverse/spatialdata-plot/pull/309 ("speed up datashader by using canvas size equal to image size"). Could you please try it?
- The difference between the plots is being investigated here: https://github.com/scverse/spatialdata-plot/issues/311 ("Wrong colors when using method='datashader'"); please follow the discussion there for updates.
Another little issue: Now there are no x_axis and y_axis labels on the figures.
I could not reproduce this. Please open a second issue to track this other problem and add some details on the code you used. In the plots I made, the ticks are rendered correctly.
Is it better to plot "cell_boundaries" instead of "cell_circles" ?
Yes, it is better to plot the cell_boundaries, as the cell_circles are approximations of the cells. Nevertheless, plotting the latter is generally faster, so it is probably preferable for quick glances at the data.
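To illustrate why a circle is only an approximation, the sketch below compares an elongated polygon with the circle of equal area. Both helpers are hypothetical, and treating the circle as an equal-area circle is an assumption made for this example; the thread only states that the circles are built from a cell radius.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def equal_area_radius(area):
    """Radius of the circle that has the given area (assumed approximation)."""
    return math.sqrt(area / math.pi)


# An elongated "cell": a 10 x 2 rectangle.
cell = [(0, 0), (10, 0), (10, 2), (0, 2)]
area = polygon_area(cell)
radius = equal_area_radius(area)
print(area, round(radius, 3))
```

The circle has a diameter of about 5, so it overshoots the 2-unit width of the cell and misses its 10-unit length, which is the kind of mismatch that becomes visible when zooming in.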
I mean there are no axis names, such as "spatial1" for x-axis and "spatial2" for y-axis. I cannot set the axis names as I need. Please refer to https://github.com/scverse/spatialdata-plot/issues/320
I don't know which plot is the correct one. Could you tell me whether datashader or matplotlib should be used for now? Or will the issues be fixed in the next release? If so, I will wait for the next release.
We are trying to fix the issue by the next release. I would suggest using matplotlib until then. CC @timtreis