
Can we order the top N genes within degplot

Open klai001 opened this issue 4 years ago • 5 comments

Hi Lorena, I noticed that degPlot() orders the top N genes alphabetically. Is there an argument I can use to order the plots of the top N genes (e.g. top 20) by their degree of variation instead of alphabetically by gene name?

klai001 avatar May 30 '20 03:05 klai001

Hi!

Thank you for the comment. Sadly there is no way to do this right now, but I can make a quick change on Monday to allow it.

Thank you for the idea!

lpantano avatar May 30 '20 14:05 lpantano

@klai001, you may want to try this new version; it may do what you expect. You will need to install it with BiocManager::install('lpantano/DEGreport'). Hopefully there are no conflicts.

lpantano avatar Jun 01 '20 21:06 lpantano

I forgot to say that you need to get the gene names first, sort them in the order you want, and then pass them to the function through the genes= parameter to plot them.
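A minimal sketch of that approach, using a small made-up count matrix rather than real data: rank the genes by variance with base R, then hand the ordered names to genes=. The matrix `toy_counts`, its gene names, and `top_n` are hypothetical. The degPlot() call is left commented out because it assumes a `dds` object and the `xs`/`group` columns from your own analysis.

```r
# Hypothetical normalized counts: 5 genes x 4 samples (illustrative values only)
toy_counts <- matrix(
  c(10, 10, 10, 10,   # geneA: no variation
     5, 50,  5, 50,   # geneB: high variation
     1,  2,  3,  4,   # geneC: low variation
    20, 80, 20, 80,   # geneD: highest variation
     7,  7,  9,  9),  # geneE: slight variation
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:5]), paste0("s", 1:4))
)

# Per-gene variance across samples
gene_var <- apply(toy_counts, 1, var)

# Gene names ordered from most to least variable; keep the top N
top_n <- 3
ordered_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(top_n)]
print(ordered_genes)

# Assumed usage with your own objects (not run here):
# degPlot(dds = dds, genes = ordered_genes, xs = "group", group = "group")
```

Any other ranking (fold change, mean expression, a custom score) can be swapped in for the variance, as long as the result is a character vector of gene names in the order you want.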

lpantano avatar Jun 01 '20 21:06 lpantano

Hi Lorena, thanks for the parameter 👍 I tried to pick out the most variable genes and put them in order, but I realised I'm getting totally different top genes in the plot. I wonder if the way I'm computing the variance and ordering the genes is wrong. I'm missing something but I'm not sure what it is.

Plot 1, before adding the genes= parameter:

c <- degPlot(dds = dds, n = 50, xs = "group", group = "group", groupLab = "sampletype", ann = c("gene_id", "symbol"), color = "Accent")

Plot 2, after adding the genes= parameter:

normcounts <- counts(dds, normalized = T)

var_genes <- apply(normcounts, 1, var)

select_var <- names(sort(var_genes, decreasing = TRUE))[1:50]

c2 <- degPlot(dds = normcounts, n = 50, xs = "group", group = "group", groupLab = "sampletype", ann = c("gene_id", "symbol"), color = "Accent", genes = select_var)

klai001 avatar Jun 04 '20 10:06 klai001

Hi,

I cannot see the plots. But you are not going to get the same genes with these two commands. The first plots the most significant genes according to the p-adj value, and the other plots just the most variable genes. Those are not the same thing.

There is something odd in the first command: did you forget to include res= here? With this function you need either res or genes, otherwise it shouldn't work.

Anyway, these two commands won't give you the same results: the first plots the most significant genes, the other the most variable ones. If you could send me a reproducible example, I could try to give more tips.
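To illustrate why the two selections differ, here is a small sketch with invented numbers: the same four genes are ranked once by adjusted p-value and once by variance. The data frame `toy` and its values are hypothetical and only meant to show that the two rankings need not agree.

```r
# Hypothetical per-gene summary (values invented for illustration)
toy <- data.frame(
  padj     = c(0.001, 0.20, 0.03, 0.50),
  variance = c(2.0, 900, 15, 400),
  row.names = c("gene1", "gene2", "gene3", "gene4")
)

# Top 2 genes by adjusted p-value (smallest first)
top_sig <- rownames(toy)[order(toy$padj)][1:2]

# Top 2 genes by variance (largest first)
top_var <- rownames(toy)[order(toy$variance, decreasing = TRUE)][1:2]

print(top_sig)
print(top_var)
```

In this toy case the two top-2 lists share no genes at all, which is the kind of mismatch described above.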

So, with genes you don't need res, and with res you need n. If you want a particular order, you need to do the calculation outside the function and pass the result as genes. Right now, if you use res and n, it will plot the genes sorted by p-adj, provided you installed the latest change.

This will select the top genes by p-adj and then order them by FC:

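res <- res[!is.na(res$padj), ]            # drop genes without an adjusted p-value
res <- res[order(res$padj), ][1:10, ]     # keep the 10 most significant genes
res <- res[order(res$log2FoldChange), ]   # then order those by fold change
genes <- rownames(res)                    # -> give this to the function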

I hope this helps.

lpantano avatar Jun 05 '20 21:06 lpantano