spatialdata-plot

figures sharpness

Open wangjiawen2013 opened this issue 1 year ago • 10 comments

Hi, the figures plotted by spatialdata are not as clear as those from squidpy (perhaps because the scatter point size is too small, but I couldn't find where to adjust the point size), and spatialdata_plot is a lot slower than squidpy when plotting. How can I improve this? Here are figures from the spatialdata tutorial: https://spatialdata.scverse.org/en/latest/tutorials/notebooks/notebooks/examples/squidpy_integration.html image

wangjiawen2013 avatar Aug 09 '24 06:08 wangjiawen2013

Both the performance and the visibility of the signal from the plot will be addressed when the bug described in the linked comment is closed, which will enable using datashader as a plotting backend.

To be precise, the first plot will still be faster (it takes less than one second), but the second plot will now take only ~8s, as opposed to ~1m. The plot will also be more interpretable, as datashader addresses many plotting pitfalls. For instance, the first plot appears to be oversaturated.

Here is what the second plot looks like now: image

LucaMarconato avatar Aug 09 '24 13:08 LucaMarconato

Anyway, if you don't want to use datashader and want to manually change the point size, please consider the following:

  1. the size of the cells was indeed too small, as explained here https://github.com/scverse/spatialdata-io/blob/258e080fe2f992e489e11b0a5e64b0a36db81d89/src/spatialdata_io/readers/xenium.py#L99, because it approximated the nuclei instead of the cells. This has now been changed to the radius of the cells, so freshly loaded datasets should appear more saturated with the matplotlib-based spatialdata plotting method as well. Please note that when you zoom in and want accurate plotting, plotting from the polygons is preferred to plotting from circles/points.
  2. you can adjust the point size by continuing to plot with squidpy, or by calling get_centroids() to obtain a spatialdata points element. This can be plotted using pl.render_points(), which has a size parameter.

Please note that if you want to plot from polygons or points, you need to change the table annotation target, as explained here: https://github.com/scverse/spatialdata-io/blob/fd3caaf37f35941a16a22cd4d51012416c4f8c3a/src/spatialdata_io/readers/xenium.py#L138.
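
A minimal sketch of option 2, assuming a SpatialData object that contains a shapes element named "cell_circles" aligned to a coordinate system called "global". The helper name plot_cell_centroids, the "_centroids" suffix, and the default size are illustrative and not part of the library; the points are left uncolored because coloring by a table column would first require changing the annotation target as noted above.

```python
def plot_cell_centroids(sdata, shapes_name="cell_circles", point_size=5.0):
    """Plot the centroids of a shapes element as points with an explicit size.

    `sdata` is assumed to be a spatialdata.SpatialData object; the element
    name, coordinate system and size are assumptions for illustration.
    """
    # Imported lazily so this module can be loaded without the plotting stack.
    import spatialdata_plot  # noqa: F401  (registers the `.pl` accessor)
    from spatialdata import get_centroids

    # get_centroids() returns a points element holding one centroid per shape.
    centroids = get_centroids(sdata[shapes_name], coordinate_system="global")

    # Store the centroids under a new (hypothetical) element name.
    points_name = f"{shapes_name}_centroids"
    sdata[points_name] = centroids

    # render_points() exposes a `size` parameter for the marker size.
    sdata.pl.render_points(points_name, size=point_size).pl.show()
    return points_name
```

Calling plot_cell_centroids(sdata, point_size=10.0) would then draw larger markers; the appropriate value depends on the figure size and the extent of the data.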

LucaMarconato avatar Aug 09 '24 13:08 LucaMarconato

@LucaMarconato I have tried the latest spatialdata_plot; datashader is now the default method. Here is the figure generated by datashader: image And this one is generated by matplotlib: image And this one is from squidpy: image

They are expected to be the same apart from the running time, but as you can see, the two figures (datashader and matplotlib) differ a lot! The matplotlib and squidpy results look more similar. Which one is right? What is stranger is that datashader takes 180s, compared with 105s for matplotlib. It seems that datashader is even slower than matplotlib. There are 250,000 cells in my dataset.
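
A comparison like the one above could be reproduced with a small timing helper. This is only a sketch: it assumes the installed spatialdata_plot version accepts a method argument ("matplotlib" or "datashader") in render_shapes, and the helper names time_call, render_with_backend and compare_backends are hypothetical.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def render_with_backend(sdata, method, element="cell_circles", color="n_counts"):
    """Render one shapes element with the requested backend on a fresh figure."""
    # Imported lazily so the timing helper works without the plotting stack.
    import matplotlib.pyplot as plt
    import spatialdata_plot  # noqa: F401  (registers the `.pl` accessor)

    fig, ax = plt.subplots()
    # `method` is assumed to select the rendering backend.
    sdata.pl.render_shapes(element, color=color, method=method).pl.show(ax=ax)
    plt.close(fig)  # release the figure so repeated runs do not pile up


def compare_backends(sdata, methods=("matplotlib", "datashader")):
    """Return a dict mapping each backend name to its rendering time in seconds."""
    return {m: time_call(render_with_backend, sdata, m)[1] for m in methods}
```

Running compare_backends(sdata) on the same object gives one wall-clock number per backend; a single run includes one-off costs such as imports and compilation, so repeating the call gives a fairer picture.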

wangjiawen2013 avatar Aug 12 '24 08:08 wangjiawen2013

Then I changed the region from "cell_circles" to "cell_boundaries", and the figure looks clear. Is it better to plot "cell_boundaries" instead of "cell_circles"?

sdata["table"].obs["region"] = "cell_boundaries"
sdata.set_table_annotates_spatialelement("table", region="cell_boundaries")
sdata.pl.render_shapes("cell_boundaries", color="n_counts").pl.show()

image

wangjiawen2013 avatar Aug 12 '24 09:08 wangjiawen2013

Another little issue: Now there are no x_axis and y_axis labels on the figures. image

wangjiawen2013 avatar Aug 12 '24 09:08 wangjiawen2013

Thanks for the details.

  • The performance problem of datashader is being addressed here: https://github.com/scverse/spatialdata-plot/pull/309. Can you try it, please?
  • The difference in plot is being investigated here: https://github.com/scverse/spatialdata-plot/issues/311, please follow the discussion there for updates.

Another little issue: Now there are no x_axis and y_axis labels on the figures.

I could not reproduce this. Please open a second issue to track this other problem and add some details on the code you used. In the plots I made, the ticks are rendered correctly.

LucaMarconato avatar Aug 12 '24 12:08 LucaMarconato

Is it better to plot "cell_boundaries" instead of "cell_circles" ?

Yes, it is better to plot the cell_boundaries, as cell_circles are approximations of the cells. Nevertheless, plotting cell_circles is generally faster, so it is probably preferred for quick glances at the data.

LucaMarconato avatar Aug 12 '24 12:08 LucaMarconato

Is it better to plot "cell_boundaries" instead of "cell_circles" ?

Yes, it is better to plot the cell_boundaries as cell_circles are approximations of the cells. Nevertheless, plotting the second is generally faster, so for quick glances at the data is probably preferred.

I mean there are no axis names, such as "spatial1" for x-axis and "spatial2" for y-axis. I cannot set the axis names as I need. Please refer to https://github.com/scverse/spatialdata-plot/issues/320

wangjiawen2013 avatar Aug 13 '24 03:08 wangjiawen2013

Thanks for the details.

Another little issue: Now there are no x_axis and y_axis labels on the figures.

I could not reproduce, please open a second issue for tracking this other problem and add some details in which code you used. In the plots I made the ticks are rendered correctly.

I don't know which plot is the correct one. Could you tell me whether datashader or matplotlib should be used now? Or will the issues be fixed in the next release? If so, I will wait for the next release.

wangjiawen2013 avatar Aug 13 '24 03:08 wangjiawen2013

We are trying to fix the issue by the next release. I would suggest using matplotlib until then. CC @timtreis

LucaMarconato avatar Aug 13 '24 10:08 LucaMarconato