decoupleR TF analysis between conditions scRNA

How can I determine whether the difference in TF activity between two conditions in my single-cell dataset is significant for a given gene?

Mar 16 '24 12:03 kiwipeel

Hi @kiwipeel ,

You can perform differential expression analysis at the pseudobulk level between conditions and then use the resulting contrast-level gene statistics as input for decoupler. There is an example of this workflow in this vignette. It is in Python but should be relatively easy to reproduce in R if that is a limitation. Hope this is helpful!
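To make the idea concrete, here is a minimal sketch (plain Python, not the decoupler API) of what a ULM-style score on contrast statistics looks like: for each TF, the gene-level statistics are regressed on the TF's target weights and the t-value of the slope is taken as the activity. The gene names, statistics, network and the helper `ulm_score` are all hypothetical, and the toy ignores details of the real implementation such as the minimum number of targets per TF.

```python
import math

def ulm_score(stats, weights):
    """t-value of the slope when regressing gene statistics on TF weights."""
    genes = sorted(stats)
    x = [weights.get(g, 0.0) for g in genes]  # non-target genes get weight 0
    y = [stats[g] for g in genes]
    n = len(genes)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    # For a univariate linear model, the slope t-value follows from r
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical contrast-level statistics (e.g. t-values, treated vs control)
# assumed to come from a pseudobulk differential expression analysis
de_stats = {"g1": 3.0, "g2": 2.5, "g3": -2.0, "g4": 0.2, "g5": -0.3, "g6": 0.1}

# Hypothetical TF -> target network; +1 = activation, -1 = repression
network = {
    "TF_A": {"g1": 1.0, "g2": 1.0, "g3": -1.0},
    "TF_B": {"g1": -1.0, "g2": -1.0, "g4": 1.0},
}

scores = {tf: ulm_score(de_stats, targets) for tf, targets in network.items()}
for tf, score in scores.items():
    print(tf, round(score, 2))
```

Each TF gets one score for the whole contrast, so the result is a single vector of activity changes rather than one value per cell.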

Mar 18 '24 07:03 PauBadiaM

Why can't I just apply statistical tests to the score values generated by the ULM model? Thank you in advance.

Mar 18 '24 08:03 kiwipeel

Hi @kiwipeel ,

You could also do that, but if the objective is to compare conditions I would recommend going the pseudobulk route, since with it you do not inflate the significance (p-values that are too small) by treating single cells as true replicates (which they are not).
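A small sketch of why this matters, using made-up numbers: three samples per condition, each with 30 cells that share their sample's mean. The helpers `welch_t` and `simulate_cells` are hypothetical, and averaging a score per sample is only a stand-in for a real pseudobulk workflow (which would normally aggregate counts).

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic for the difference mean(b) - mean(a)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se

def simulate_cells(sample_mean, n_cells=30):
    """Deterministic toy cells: small wiggle around the sample's own mean."""
    offsets = (-0.1, 0.0, 0.1)
    return [sample_mean + offsets[i % 3] for i in range(n_cells)]

# Three biological samples per condition; sample-to-sample variation is large
control = [simulate_cells(m) for m in (1.0, 2.0, 3.0)]
treated = [simulate_cells(m) for m in (1.5, 2.5, 3.5)]

# Cell level: every cell is (wrongly) treated as an independent replicate
t_cells = welch_t(
    [v for sample in control for v in sample],
    [v for sample in treated for v in sample],
)

# Sample level: one value per sample, so n = 3 per condition
t_pseudobulk = welch_t(
    [mean(sample) for sample in control],
    [mean(sample) for sample in treated],
)

print("cell-level t:", round(t_cells, 2))
print("sample-level t:", round(t_pseudobulk, 2))
```

The difference between conditions is identical in both tests; only the assumed number of replicates changes, and that alone makes the cell-level statistic several times larger.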

Mar 18 '24 08:03 PauBadiaM

Thank you. The p-values in the run_ulm results represent the significance of the score for each cell and transcription factor, am I right? Why do we create a new assay from all of these scores when some of them do not have significant p-values?

Mar 18 '24 09:03 kiwipeel

Hi @kiwipeel ,

Indeed! We keep all of them since p-value thresholding is completely arbitrary; depending on the application you might want to use a stricter or a more relaxed threshold.

Mar 18 '24 09:03 PauBadiaM

Thank you. However, if I put a threshold on the p-value, there will be missing values in the new assay we generate from the TF scores. What is the correct way to handle this?

Mar 18 '24 10:03 kiwipeel

It really depends on the downstream task you want to use them for. In your case, since you are interested in contrasting conditions, I would again recommend doing it at the pseudobulk level. There the filtering by p-value is easier to handle, since you obtain a single vector of changes in activity that may or may not be significant.
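As a rough illustration of filtering such a vector, the sketch below keeps every TF score and only adds a significance flag, so nothing goes missing. The TF names, scores and p-values are invented, and the Benjamini-Hochberg adjustment is my own assumption about a reasonable multiple-testing correction, not something prescribed in this thread.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical contrast-level results: one score and p-value per TF
tfs = ["TF_A", "TF_B", "TF_C", "TF_D"]
activity_change = [4.2, -3.1, 2.0, 0.4]
pvals = [0.001, 0.02, 0.04, 0.5]

alpha = 0.05  # arbitrary threshold; adjust to the application
padj = bh_adjust(pvals)

# Keep the full table and flag significance instead of dropping rows
results = [
    {"tf": tf, "score": s, "padj": q, "significant": q < alpha}
    for tf, s, q in zip(tfs, activity_change, padj)
]
significant_tfs = [row["tf"] for row in results if row["significant"]]
print(significant_tfs)
```

Because the threshold is only applied to a flag, changing `alpha` later does not require recomputing or re-imputing anything.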

Mar 19 '24 09:03 PauBadiaM

Thank you again. One last question: after getting the TF assay from the model, should I use the ScaleData() function on the TF assay with the split.by argument set to my conditions?

Mar 19 '24 09:03 kiwipeel

Hi @kiwipeel , if it is just for plotting, yes. I am not sure about the split.by argument though; wouldn't you want to see the differences between your conditions instead?
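To illustrate the concern, here is a toy comparison in plain Python (not Seurat code) of z-scoring one TF across all cells versus separately within each condition. It assumes that split.by causes each group to be centered and scaled on its own; the activity values and the helper `zscore` are made up.

```python
from statistics import mean, stdev

def zscore(values):
    """Center to mean 0 and scale to unit standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical activities of one TF in three cells per condition
activity = {"control": [1.0, 2.0, 3.0], "treated": [4.0, 5.0, 6.0]}

# Scaling across all cells together keeps the shift between conditions
scaled_all = zscore(activity["control"] + activity["treated"])
global_scaled = {"control": scaled_all[:3], "treated": scaled_all[3:]}

# Scaling within each condition centers both groups at zero
split_scaled = {cond: zscore(vals) for cond, vals in activity.items()}

for name, scaled in (("global", global_scaled), ("split", split_scaled)):
    diff = mean(scaled["treated"]) - mean(scaled["control"])
    print(name, "difference between condition means:", round(diff, 2))
```

With per-condition scaling the difference between condition means disappears, which is why it would hide exactly the effect a between-condition plot is meant to show.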

Mar 19 '24 09:03 PauBadiaM