How to estimate spot.diameter and spot for stereo-seq data
Hi,
I am using CellChat-V2 for the analysis of stereo-seq (cell-bin) data. Could you provide guidance on how to estimate the 'spot.diameter' and 'spot' parameters within 'scale.factors', as these are required by the 'createCellChat' function?
Thank you in advance for your assistance.
Best, Leon
Same request here. It would be good if CellChat considered adding an analysis pipeline for spatial data with single-cell resolution.
@LeonSong1995 Hello! Thanks for your interest in CellChat. For Stereo-seq data, we have figured out how to compute the communication probability with a true scale.factor for the bin-based approach. I will give a reference for the cell-bin-based approach next week. 🥹 Please wait a few days.
@HelloWorldLTY For spatial data with single-cell resolution, such as Slide-seq2 and Seq-Fish, the spatial coordinates are already in $\mu m$, so you can simply set spot.diameter = 10 ($\mu m$; this value is for Slide-seq2) and spot = spot.diameter.
@suye0620 Hi, could you also help me figure out the spot.diameter for MERFISH and STARmap?
@wwang-chcn I am not sure whether MERFISH data contains enough signaling genes for cell-cell communication analysis. I would encourage you to check the unit of the spatial coordinates for these techniques.
@LeonSong1995 Hi~ o( ̄▽ ̄)ブ
First, CellChat uses real physical distances (in SI units such as $\mu m$ or nm) as distance constraints when calculating communication probabilities, so we must check the distance unit of the coordinates in the dataset. In my experience, the coordinate unit in a Stereo-seq expression dataset can be intuitive or not. In the intuitive case (Fig 1), the minimum distance between coordinates is 50 px, which corresponds to 50 px × 500 nm/px = 25000 nm = 25 $\mu m$, so the unit is "px". In the non-intuitive case, the coordinate representation has been simplified so that the minimum distance between coordinates is "1" instead of "50 px", even though the bin size is 50; the unit is then an abstract "1". When working with spatial transcriptomic data and inferring interactions between individual cells, we need to set scale.factors with a correct understanding of this unit.
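One way to check which of the two cases applies is to look at the smallest step between distinct coordinate values. The sketch below uses a made-up coordinate table for bin-level data on a regular grid; `coords` and `min_step()` are hypothetical names, not part of CellChat, and the check is less meaningful for cell-bin data, where cell positions are irregular.

```r
# Toy coordinates for a bin-50 dataset whose bins sit 50 coordinate units apart.
# `coords` and `min_step()` are illustrative names, not CellChat objects.
coords <- data.frame(x = c(0, 50, 100, 50), y = c(0, 0, 50, 100))

# Smallest positive gap between distinct values along one axis.
min_step <- function(v) {
  gaps <- diff(sort(unique(v)))
  min(gaps[gaps > 0])
}

step <- min(min_step(coords$x), min_step(coords$y))
step  # 50 for this toy table, i.e. coordinates in "px"
```

A result of 1 for a bin-50 dataset would point to the simplified coordinate representation described above.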
Second, here is what scale.factors contains and why two of its parameters matter. scale.factors is a list object in R that describes how the distance representation in a spatial transcriptomic dataset relates to physical distance. The user must supply this list when datatype = "spatial". scale.factors must contain an element named spot.diameter, which is the theoretical spot size (e.g., 10x Visium, spot.size = 65 microns), and another element named spot, which is the number of pixels that span the diameter of a theoretical spot in the original, full-resolution image. For 10x Visium, the scale factors are in 'scalefactors_json.json', and scale.factors$spot is 'spot_diameter_fullres'. For Stereo-seq, scale.factors$spot.diameter is the bin size in SI units ($\mu m$), and scale.factors$spot is the bin size in the coordinate representation of the dataset, which can be "50" (px) or "1" (abstract unit, the minimum distance between coordinates) when binsize = 50. So you can set scale.factors = list(spot.diameter = 25, spot = 50) or scale.factors = list(spot.diameter = 25, spot = 1), together with the parameter interaction.length = 200 in the computeCommunProb function.
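As a rough illustration of the two settings just described, here is a minimal sketch that builds the list from a bin size and the coordinate step. `stereo_scale_factors()` and its arguments are hypothetical names, not a CellChat function, and the 0.5 $\mu m$ per pixel value is taken from the 500 nm figure used in this reply.

```r
# Hypothetical helper (not a CellChat function): build the scale.factors list
# for bin-level Stereo-seq data, assuming 500 nm (0.5 um) per original pixel.
stereo_scale_factors <- function(bin_size, coord_step, um_per_px = 0.5) {
  list(
    spot.diameter = bin_size * um_per_px,  # physical bin size in um
    spot = coord_step                      # the same bin size in coordinate units
  )
}

sf_px <- stereo_scale_factors(bin_size = 50, coord_step = 50)   # coordinates in px
sf_unit <- stereo_scale_factors(bin_size = 50, coord_step = 1)  # simplified coordinates
str(sf_px)
```

Both lists describe the same physical bin; only the coordinate-side value differs.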
Lastly, we use equations to make the distance calculation clearer. We need the cell-to-cell distance in SI units, denoted $d_{(cell-to-cell)}$ ($\mu m$). From the Stereo-seq dataset, however, we can only get the distance in pixels or in the abstract unit "1", denoted $x_{(cell-to-cell)}$. We then check the coordinates to get the bin size in the coordinate representation, denoted $bin.size.coordinates$. With the true bin size in SI units, denoted $bin.size.true$ ($\mu m$), we can define:
$$ \frac{d_{(cell-to-cell)}\ (\mu m)}{bin.size.true\ (\mu m)} = \frac{x_{(cell-to-cell)}}{bin.size.coordinates} $$
$$ d_{(cell-to-cell)}\ (\mu m) = \frac{x_{(cell-to-cell)}}{bin.size.coordinates} \cdot bin.size.true\ (\mu m) $$
For example, if the bin size is 50, we get $bin.size.true = 50\ px \times 500\ nm/px = 25\ \mu m$ and $bin.size.coordinates = 50\ px$. We use these to calculate real distances (in $\mu m$) as distance constraints, and then set the parameter interaction.length = 200 (also in $\mu m$) in the computeCommunProb function.
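The conversion can be sketched in base R as follows. This only illustrates the formula with made-up coordinates; the object names are hypothetical and this is not how CellChat is confirmed to compute distances internally.

```r
# Convert coordinate distances to micrometres with
# d_um = x / bin.size.coordinates * bin.size.true.
# All object names here are illustrative; none come from CellChat.
bin_size_true <- 25         # um, for bin 50 at 500 nm per px
bin_size_coordinates <- 50  # coordinate units spanned by one bin

coords <- rbind(a = c(0, 0), b = c(300, 400), c = c(50, 0))
x <- as.matrix(dist(coords))  # pairwise distances in coordinate units
d_um <- x / bin_size_coordinates * bin_size_true

d_um["a", "b"]         # 500 px apart -> 250 um
d_um["a", "c"]         # adjacent bins -> 25 um
d_um["a", "b"] <= 200  # FALSE: outside a 200 um interaction length
```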
As for cell-bin data (Fig 4), the bin size appears to be 1, so you can set scale.factors = list(spot.diameter = 0.5, spot = 1), with bin.size.coordinates = 1 (px).
Note: the unit of spot.diameter is $\mu m$, consistent with the unit of the interaction.length parameter.
Tip: Stereo-seq data is usually large and hard on a PC's memory, so you need sparse matrix computation to speed up some functions in CellChat. We have adapted CellChat to Stereo-seq, but it will take more time to check and publish it. Good luck using CellChat on Stereo-seq! 😀
I also have a question that confuses me. For 10x Visium, the 10x tutorials say the spot diameter is 55 $\mu m$. Why should we set 65 in the CellChat tutorial? Thanks!
@Knight1995 You can check the details in https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/output/spatial
Thanks! I recently updated to version 2, and there may be a bug when running computeCommunProbPathway: an empty string ('') occurs as a pathway name in your source code. I suggest changing the code pathways <- unique(pairLR.use$pathway_name) to pathways <- unique(pairLR.use$pathway_name[pairLR.use$pathway_name != ""]) to avoid this bug.
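To show the effect of the suggested change, here is a small sketch on a toy table. The `pairLR.use` below is a made-up stand-in with placeholder names; the real table comes from the CellChat database.

```r
# Toy stand-in for pairLR.use with placeholder interaction and pathway names.
pairLR.use <- data.frame(
  interaction_name = c("L1_R1", "L2_R2", "L3_R3"),
  pathway_name = c("A", "B", ""),
  stringsAsFactors = FALSE
)

# Original line: the empty string is kept as if it were a pathway.
unique(pairLR.use$pathway_name)

# Suggested line: empty pathway names are dropped first.
pathways <- unique(pairLR.use$pathway_name[pairLR.use$pathway_name != ""])
pathways  # "A" "B"
```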
@Knight1995 The database issue has been fixed, and each L-R pair is now assigned to a signaling pathway.
Thank you for your interest in Stereo-seq. I am unclear about the connection between the list "scale.factors = list(spot.diameter = 0.5, spot = 1)" that you mentioned above and the data.frame "scale.factors = (ratio = conversion.factor, tol = spot.size/2)" in the FAQ. Could you please clarify?
@Li-ZhiD OK. We now use two parameters in scale.factors: ratio and tol. The two parameters we used previously, spot.diameter and spot, together determine the current ratio parameter.
For example, if the bin size is 50, the bin size in SI units is bin.size.true = 50 bin × 500 nm/bin = 25 $\mu m$, and the bin size in Stereo-seq coordinates is 50 px (one bin occupies 1 px). So 1 pixel in the coordinates equals 0.5 $\mu m$, which is the ratio:
# ratio = conversion.factor = spot.size / scalefactors$spot_diameter_fullres
$$ ratio = \frac{25\ \mu m}{50\ px} = 0.5\ \mu m/px $$
The bin size in Stereo-seq is the same concept as the spot in 10x Visium, so tol can be set to:
tol = bin.size.true/2 = 12.5
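Putting the two values together, a minimal sketch for a bin-50 dataset with coordinates in px might look like this. `scale_factors_new` and the intermediate variable names are hypothetical; the ratio/tol layout follows the data.frame form from the FAQ quoted earlier in this thread.

```r
# Illustrative values for a bin-50 dataset whose coordinates are in px.
bin_size_true <- 50 * 0.5   # 25 um, assuming 500 nm per px
bin_size_coordinates <- 50  # bin size in coordinate units

# Hypothetical object name; columns follow the ratio/tol form from the FAQ.
scale_factors_new <- data.frame(
  ratio = bin_size_true / bin_size_coordinates,  # um per coordinate unit
  tol = bin_size_true / 2                        # half a bin, in um
)
scale_factors_new
```

Under the same definition, a dataset with simplified coordinates (a step of 1 per bin) would give ratio = 25 rather than 0.5, since one coordinate unit then spans a whole bin.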
Thank you!
We used Stereo-seq with binsize = 100. Following your instructions, I set scale.factors = list(spot.diameter = 50, spot = 1, ratio = 1, tol = 25), but I get this message: "The suggested minimum value of scaled distances is in [1,2], and the calculated value here is Inf". What went wrong?
Could you please explain the meaning of ratio and tol, and their relationship with spot size and center-to-center size? I find these concepts confusing.
My previous setting is
scale.factors = list(spot.diameter = 0.5, spot = 1)
The new setting is
scale.factors1 = data.frame(ratio = scale.factors$spot / scale.factors$spot.diameter, tol = scale.factors$spot/2)
I'm wondering whether this is correct?
I encountered an error when running the following command, which worked well in the previous version. Could you help me figure out what happened and what changes I should make?
cellchat <- computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1,
distance.use = TRUE, interaction.range = 200, scale.distance = 0.1)
The output:
truncatedMean is used for calculating the average gene expression per cell group.
[1] ">>> Run CellChat on spatial transcriptomics data using distances as constraints of the computed communication probability <<< [2023-12-08 14:06:11.96631]"
Warning message in min(d.spatial, na.rm = TRUE):
“no non-missing arguments to min; returning Inf”
The suggested minimum value of scaled distances is in [1,2], and the calculated value here is Inf
Warning message in min(d.spatial, na.rm = TRUE):
“no non-missing arguments to min; returning Inf”
Error in if (sum(P1_Pspatial) == 0) {: missing value where TRUE/FALSE needed
Traceback:
computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1,
distance.use = TRUE, interaction.range = 200, scale.distance = 0.1)
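The warning in this output is base R behaviour rather than something specific to CellChat: `min(..., na.rm = TRUE)` returns `Inf` when no non-missing values remain. The sketch below only reproduces that behaviour. That the conversion factor is missing from the scale factors is an assumption made for illustration, not a confirmed diagnosis, and `d_coord` and `sf` are made-up names.

```r
# Made-up coordinate distances and an old-style scale.factors list without `ratio`.
d_coord <- c(10, 20, 30)
sf <- list(spot.diameter = 0.5, spot = 1)

# sf$ratio is NULL, so the product is an empty vector (assumed failure mode).
d_scaled <- d_coord * sf$ratio
length(d_scaled)  # 0

# min() of an empty vector warns and returns Inf, matching the message in the log.
m <- suppressWarnings(min(d_scaled, na.rm = TRUE))
is.infinite(m)  # TRUE
```

If this is what happens, checking that the scale factors contain a numeric ratio before calling computeCommunProb would be a reasonable first step.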
@wwang-chcn Which types of data are you using? 10x Visium?
@sqjin Thanks for your reply. I'm using Stereo-seq data with cell-bin.
I'm still confused about how to set spatial.factors for Stereo-seq data when calling: cellchat <- CellChat::createCellChat(object = data.input, meta = meta, group.by = "labels", datatype = "spatial", coordinates = spatial.locs, spatial.factors = spatial.factors)
Do you have clear instructions on how to set these parameters at different bin and cellbin levels?
