Large number of significant interactions
Hello, I'm using cell2cell to identify interactions in scRNA-seq data, with the following call:
interactions = c2c.analysis.SingleCellInteractions(
    rnaseq_data=rnaseq,
    ppi_data=lr_pairs,
    metadata=meta,
    interaction_columns=('source_genesymbol', 'target_genesymbol'),
    communication_score='expression_gmean',
    cci_score='bray_curtis',
    cci_type='directed',
    aggregation_method='average',
    barcode_col='index',
    celltype_col=args.annot,
    complex_sep='_',
    verbose=True,
)
When I look at the number of interactions per cell type pair, i.e. the entries of ccc_permutation_pvalues with p < 0.05, I get a very large number of significant interactions (up to 1800 for a single pair), far more than when using LIANA. I'm wondering why that could be.
Hi @joan-yanqiong!
Yeah, I think that happens because cell2cell does not filter genes by the fraction of cells expressing them. LIANA does, through the expr_prop parameter. I think LIANA's default value is 0.1, which means it only considers genes that are expressed in at least 10% of the single cells in a cell type. So, at the end of the day, the pool of LR pairs for the dataset is smaller, and you therefore get a smaller number of significant interactions.
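To make the idea of an expression-proportion filter concrete, here is a minimal pandas sketch. It is not part of cell2cell or LIANA: the helper names expressed_fraction and filter_lr_pairs, the min_prop argument, and the toy data are all hypothetical. It assumes a genes x cells expression table and a Series mapping cell barcodes to cell types. It keeps an LR pair if both genes pass the threshold in at least one cell type, which is a dataset-level simplification, and it does not handle complexes joined with '_'.

```python
import pandas as pd

def expressed_fraction(expr, cell_types):
    """Fraction of cells in each cell type with nonzero expression of each gene.

    expr: genes x cells DataFrame (assumed layout).
    cell_types: Series mapping cell barcode -> cell type label.
    Returns a cell types x genes DataFrame.
    """
    detected = (expr > 0).T.astype(float)          # cells x genes, 1.0 if detected
    labels = cell_types.reindex(detected.index)    # align labels to the cells
    return detected.groupby(labels).mean()

def filter_lr_pairs(lr_pairs, frac, min_prop=0.1,
                    ligand_col='source_genesymbol',
                    receptor_col='target_genesymbol'):
    """Keep LR pairs whose ligand and receptor each pass min_prop in some cell type."""
    passing = set(frac.columns[(frac >= min_prop).any(axis=0)])
    keep = lr_pairs[ligand_col].isin(passing) & lr_pairs[receptor_col].isin(passing)
    return lr_pairs[keep]

# Toy example: gene L2 is not detected in any cell, so the L2-R1 pair is dropped.
expr = pd.DataFrame(
    {'c1': [1, 0, 0], 'c2': [2, 0, 0], 'c3': [0, 0, 5], 'c4': [0, 0, 3]},
    index=['L1', 'L2', 'R1'],
)
cell_types = pd.Series({'c1': 'A', 'c2': 'A', 'c3': 'B', 'c4': 'B'})
lr_pairs = pd.DataFrame({
    'source_genesymbol': ['L1', 'L2'],
    'target_genesymbol': ['R1', 'R1'],
})

frac = expressed_fraction(expr, cell_types)
filtered = filter_lr_pairs(lr_pairs, frac, min_prop=0.1)
print(filtered)
```

The filtered table could then be passed as ppi_data in place of the full LR list, so that the permutation test only sees pairs with some minimal support in the data.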
Thank you for the very fast response. Okay, that makes sense, so the interactions that I get with cell2cell are not necessarily wrong. Is there a way to make the analysis more stringent to reduce the number of interactions?
No, unfortunately. I developed this before most of the recent tools came out, and I haven't had time to implement the filtering steps they include. I plan to add more stringent filtering at some point.
I was thinking of maybe using the scores as a filtering step to reduce the number of interactions: interactions.interaction_space.interaction_elements["communication_matrix"]. Would that make sense?
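As a rough illustration of what such score-based filtering might look like, the sketch below combines a p-value cutoff with a minimum communication score. It assumes the scores and the permutation p-values are both available as pandas DataFrames with LR pairs as rows and cell type pairs as columns; the function filter_by_score, the min_score value, and the toy tables are hypothetical, and the thread does not establish what score threshold would be appropriate.

```python
import pandas as pd

def filter_by_score(scores, pvals, alpha=0.05, min_score=0.5):
    """Boolean mask of interactions that are significant AND have a high enough score.

    scores, pvals: DataFrames with LR pairs as rows and cell type pairs as columns
    (assumed layout). min_score is an arbitrary, illustrative threshold.
    """
    # Align the p-values to the score table; entries missing after alignment
    # become NaN and fail the comparison below.
    pvals = pvals.reindex(index=scores.index, columns=scores.columns)
    return (pvals < alpha) & (scores >= min_score)

# Toy stand-ins for the communication matrix and the permutation p-values.
scores = pd.DataFrame(
    {'A;B': [0.9, 0.1], 'B;A': [0.6, 0.7]},
    index=['L1^R1', 'L2^R2'],
)
pvals = pd.DataFrame(
    {'A;B': [0.01, 0.01], 'B;A': [0.20, 0.03]},
    index=['L1^R1', 'L2^R2'],
)

keep = filter_by_score(scores, pvals, alpha=0.05, min_score=0.5)
counts = keep.sum(axis=0)  # interactions retained per cell type pair
print(counts)
```

In this toy case the L2^R2 interaction in A;B is significant but has a low score, so it is dropped, while L1^R1 in B;A has a high score but fails the p-value cutoff.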
Hello @joan-yanqiong
I know it's been a while now, but do you remember how you went about filtering the dataset?
Best