
Plotting with FeaturePlot and split.by: potential bias towards the number of cells per condition?

Open vkavaka opened this issue 2 years ago • 1 comments

Hello, dear Seurat community! While analyzing one of our datasets recently, we tried to plot the expression of certain genes per condition. Here is one example: [Screenshot 2022-09-25: FeaturePlot split by condition] It looks as if group 3 has higher gene expression than groups 1 and 2. However, group 3 has 40,000 cells, whereas groups 1 and 2 have 20,000 each. We have been wondering whether the visualization may be biased by the number of cells per condition. If so, is there any way to normalize the output of a split FeaturePlot by the number of cells per condition?

vkavaka, Sep 25 '22 13:09

Hi,

I'm not a member of the dev team, but hopefully I can be helpful. Yes, if one condition has more cells (especially by the magnitude you are describing), FeaturePlot can be slightly misleading, because there may simply appear to be more positive dots. This can be compounded by the Seurat default of order = FALSE, which means that expressing cells can be hidden behind non-expressing cells.
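
To make the "more positive dots" effect concrete, here is a minimal sketch in plain Python rather than Seurat code. The toy data and the helper `downsample_per_group` are hypothetical and not part of Seurat. Every group has the same fraction of expressing cells, yet the larger group shows twice as many positive cells until each group is randomly downsampled to the size of the smallest one.

```python
import random


def downsample_per_group(cells_by_group, seed=0):
    """Randomly keep the same number of cells in every group.

    The number kept is the size of the smallest group.
    """
    rng = random.Random(seed)
    n_keep = min(len(cells) for cells in cells_by_group.values())
    return {g: rng.sample(cells, n_keep) for g, cells in cells_by_group.items()}


# Hypothetical toy data: 1 = cell expresses the gene, 0 = it does not.
# Every group has exactly 10% expressing cells; only the group sizes differ.
groups = {
    "group1": [1] * 2000 + [0] * 18000,
    "group2": [1] * 2000 + [0] * 18000,
    "group3": [1] * 4000 + [0] * 36000,
}

# Count of positive cells ("positive dots") before and after balancing.
positive_before = {g: sum(cells) for g, cells in groups.items()}
balanced = downsample_per_group(groups)
positive_after = {g: sum(cells) for g, cells in balanced.items()}

print("before:", positive_before)
print("after: ", positive_after)
```

Before balancing, group3 has twice as many positive cells purely because it is twice as large; after balancing, the counts are roughly equal. The same idea would apply to a Seurat object by subsetting each condition to an equal number of cells before plotting.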

To compare expression levels, you may be better off with a VlnPlot or perhaps a DotPlot, which shows relative (scaled) expression as well as the percentage of cells expressing the gene.

Of course, also remember that differences displayed in any of these plots are not meaningful evidence of a difference in expression between groups unless a differential expression (DE) analysis also confirms that difference.

I should also note that differences in expression level may be independent of the percentage of cells expressing a gene. For instance, it's possible that the average level of expression in the cells that express the gene is equivalent between groups, while the percentage of cells expressing the gene is very different from one group to another. Both are important to take into account when choosing a visualization to represent your data.
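
As a small illustration of that last point, the following plain-Python sketch (toy values and the helper `summarize` are hypothetical, not Seurat code) computes both quantities for two groups whose expressing cells have identical expression levels but whose fraction of expressing cells differs.

```python
def summarize(values):
    """Return (percent of cells with expression > 0,
    mean expression among those expressing cells)."""
    expressing = [v for v in values if v > 0]
    pct = 100.0 * len(expressing) / len(values)
    mean_expr = sum(expressing) / len(expressing) if expressing else 0.0
    return pct, mean_expr


# Hypothetical expression values for 100 cells per group.
group_a = [2.0] * 10 + [0.0] * 90  # 10 of 100 cells express the gene
group_b = [2.0] * 50 + [0.0] * 50  # 50 of 100 cells express the gene

for name, vals in (("A", group_a), ("B", group_b)):
    pct, mean_expr = summarize(vals)
    print(f"group {name}: {pct:.0f}% expressing, "
          f"mean among expressing = {mean_expr:.1f}")
```

Both groups have the same mean expression among expressing cells (2.0), but 10% versus 50% of cells express the gene. A plot that shows only one of these two quantities would miss the other, which is why a DotPlot-style summary that shows both can be useful.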

Best, Sam

samuel-marsh, Sep 25 '22 17:09

Hi,

I agree with Sam's comments. FeaturePlot may not be the best visualization for comparing the percentage of cells expressing a certain gene; VlnPlot and DotPlot may be more effective for that.

mhkowalski, Oct 07 '22 19:10