single-cell-best-practices
Better sorting of pathways in /conditions/gsea_pathway.html?
The following code block can lead to an analysis problem:
gsea_results = (
    pd.concat({"score": scores.T, "norm": norm.T, "pval": pvals.T}, axis=1)
    .droplevel(level=1, axis=1)
    .sort_values("pval")
)
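When the effect is strong, many p-values are reported as exactly 0. Those gene sets are then tied, so sorting by p-value cannot rank them against each other, and their order in the table is arbitrary. I suggest sorting by the absolute value of the normalized score instead. In that case the last line of the chain should be: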
.sort_values("norm", key=np.abs, ascending=False)
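To illustrate the tie problem, here is a minimal sketch with made-up numbers. The pathway names and values are hypothetical and not taken from the notebook; only the column names follow the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical GSEA output: three pathways hit the p-value floor of 0.
gsea_results = pd.DataFrame(
    {
        "score": [0.42, -0.81, 0.65, 0.30],
        "norm": [1.9, -3.4, 2.7, 1.1],
        "pval": [0.0, 0.0, 0.0, 0.04],
    },
    index=["pathway_A", "pathway_B", "pathway_C", "pathway_D"],
)

# Sorting by p-value cannot separate the three pathways tied at 0.
by_pval = gsea_results.sort_values("pval")

# Sorting by |normalized score| ranks by effect size, regardless of sign.
by_norm = gsea_results.sort_values("norm", key=np.abs, ascending=False)
print(by_norm)
```

Note that the `key` argument of `sort_values` requires pandas 1.1 or newer.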
Additionally, I would suggest plotting the scores instead of the p-values, or at least showing the direction of the score (positive or negative) by color.
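One possible way to show direction by color is sketched below with matplotlib. The data, the color choices, and the variable name `top` are my own assumptions for illustration, not what the notebook currently does.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical top pathways, already sorted by |normalized score|.
top = pd.DataFrame(
    {"norm": [-3.4, 2.7, 1.9, 1.1]},
    index=["pathway_B", "pathway_C", "pathway_A", "pathway_D"],
)

# Red for positive scores, blue for negative scores (arbitrary color choice).
colors = np.where(top["norm"] > 0, "tab:red", "tab:blue").tolist()

fig, ax = plt.subplots(figsize=(5, 3))
ax.barh(top.index, top["norm"], color=colors)
ax.axvline(0, color="black", linewidth=0.8)  # mark the zero line
ax.set_xlabel("Normalized enrichment score")
ax.invert_yaxis()  # strongest pathway at the top
fig.tight_layout()
```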