single-cell-best-practices

Better sorting of pathways in /conditions/gsea_pathway.html?

Open VladimirShitov opened this issue 7 months ago • 0 comments

The following code block can lead to an analysis problem:

gsea_results = (
    pd.concat({"score": scores.T, "norm": norm.T, "pval": pvals.T}, axis=1)
    .droplevel(level=1, axis=1)
    .sort_values("pval")
)

When the effect is strong, many p-values are reported as exactly 0. Sorting by p-value then leaves those gene sets tied, so they are not ranked meaningfully. I suggest sorting by the absolute normalized score instead. In this case the last line should be: .sort_values("norm", key=np.abs, ascending=False)
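To illustrate the tie problem, here is a minimal sketch on a toy table. The gene set names and all numbers are made up for illustration; only the column names ("score", "norm", "pval") follow the code block above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy results: four of five gene sets have a p-value of exactly 0.
gsea_results = pd.DataFrame(
    {
        "score": [0.42, -0.81, 0.77, 0.15, -0.65],
        "norm": [1.9, -3.4, 3.1, 0.6, -2.7],
        "pval": [0.0, 0.0, 0.0, 0.4, 0.0],
    },
    index=["SET_A", "SET_B", "SET_C", "SET_D", "SET_E"],
)

# Sorting by p-value: the four tied gene sets come first,
# but their relative order carries no information.
by_pval = gsea_results.sort_values("pval")

# Sorting by absolute normalized score: strongest effects first,
# regardless of direction.
by_norm = gsea_results.sort_values("norm", key=np.abs, ascending=False)

print(by_pval)
print(by_norm)
```

Note that the `key` argument of `sort_values` requires pandas 1.1 or newer.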

Additionally, I would suggest plotting scores instead of p-values, or at least showing the direction of the score by color.
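One possible way to do this, as a rough sketch rather than a proposal for the exact figure: a horizontal bar plot of the normalized score, colored by sign. The toy data, the color choice, and the number of gene sets shown are all assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical normalized scores; names and values are made up.
toy = pd.DataFrame(
    {"norm": [1.9, -3.4, 3.1, 0.6, -2.7]},
    index=["SET_A", "SET_B", "SET_C", "SET_D", "SET_E"],
)

# Keep the four gene sets with the largest absolute normalized score.
top = toy.sort_values("norm", key=np.abs, ascending=False).head(4)

# Encode direction by color: red for positive, blue for negative scores.
colors = ["tab:red" if value > 0 else "tab:blue" for value in top["norm"]]

fig, ax = plt.subplots(figsize=(4, 3))
ax.barh(top.index, top["norm"], color=colors)
ax.invert_yaxis()  # strongest gene set at the top
ax.axvline(0, color="black", linewidth=0.8)
ax.set_xlabel("Normalized enrichment score")
fig.tight_layout()
fig.savefig("gsea_top_pathways.png")  # hypothetical output file name
```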

VladimirShitov, Jul 24 '24 15:07