gggenomes icon indicating copy to clipboard operation
gggenomes copied to clipboard

Scale on ribbon

Open Daniel-Ze opened this issue 2 years ago • 4 comments

Hi, lately I have been playing around with the amazing gggenomes library. I'm interested in visualizing gene counts within non overlapping sliding windows. Similar to the GC content example in the documentation I wanted to plot the data in a ribbon above and below the chromosomes / genomic regions plus some extra ribbons representing alternative treatments.

Is there a way to show a scale next to the ribbons that I drew in this picture?

I know there are some overlaps that I still have to work out. The scale next to the ribbons would be one step closer to a perfect plot :) It would be amazing if someone could help me out with this!

Below is some example code to show how I generated the red ribbon plot. I added all the other ribbons in the same way adjusting their y value and their fill color.

seqs = fasta index file read in with read_fai() mapping = paf file generated with minimap2 read in with read_paf() gene_cluster = bed file with genes per window count read in with read_bed()

p1<-gggenomes(seqs = seqs, links = mapping)+
  geom_seq()+
  geom_link(aes(fill=blast, color=blast), alpha=0.8, offset = c(0.3, 0.3))+
p2<- p1 %>% add_feats(gene_cluster)+ 
  new_scale("color")+
  geom_ribbon(aes(x=(x+xend)/2, 
                  ymax=y, 
                  ymin=y-(0.03*score),
                  group=seq_id,
                  linetype="Genes"), 
              feats(gene_cluster),
              fill="red", 
              linetype=1,
              size=0.1,
              color="black",
              alpha=.5)

continuing like this for all the ribbons shown.

Cheers, Daniel

Daniel-Ze avatar Mar 21 '22 18:03 Daniel-Ze

Fancy! Good question. What kind of scale do you mean? A complete axis, or a small scale bar with fixed size for comparison. Maybe akin to #85 geom_scale_bar but on the y-axis matching the ribbon scale?

On a side note, check out gggenomes::geom_wiggle() which I've added a while back. It wraps geom_ribbon(), so that you don't need to add all the y,ymax,x,xend, ....

thackl avatar Mar 21 '22 19:03 thackl

Thanks a lot for the geom_wiggle() tip! Indeed makes it much easier to plot the ribbons. I tried the "line" and "linerange" option which gives you a legend for the plotted data. This is already pretty nice! Below is the "line" version as I think the "linerange" is a bit confusing in a whole genome context. To elaborate the graph a bit: The upper ribbon is actual annotated genes of a gene family and the lower part is expressed genes of the same gene family from IsoSeq data. example_sw_wiggle

However, the problem I encounter is that each ribbon gets its own legend scale. It should be a shared one among all the ribbons.

I guess an added dedicated axis for each ribbon would be the easiest for people to understand what they are looking at there.

Thank you for your help!

Cheers, Daniel

Daniel-Ze avatar Mar 22 '22 11:03 Daniel-Ze

A little update to this. I had some time to play around with this and I got a bit further with the gggenomes builtin aesthetics:

  • First I draw the bare genome
  • Then I add the feature track plus a bar plot
  • To this I add the axis information plus a continuous scale that takes the max from the added feature track.
p_chr <- gggenomes(seqs = genome)+
  geom_seq()

p_chr %>% add_feats(cluster)+
  geom_bar(aes(x=(x+xend)/2,
               y = score), 
              feats(cluster),
              fill="orange",
              linetype=1,
              size=0.1,
              alpha=1, 
              stat = "identity")+
  theme(axis.line.y.left = element_line(),
        axis.ticks.y = element_line(),
        axis.text.y = element_text(),
        axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.x = element_blank())+
  scale_y_continuous(limits = c(0, max(cluster$score)),breaks = seq(0,max(cluster$score), 2))

image

My question would be if there is a way to move the bare genome to 0. It seems to sit always at 1 at the Y-axis. I haven't tried to add this to a bigger gggenomes plot.

Daniel-Ze avatar Jun 30 '22 14:06 Daniel-Ze

Here's a solution, although granted, not the prettiest ...

p1 <- gggenomes(emale_genes, emale_seqs) |> add_clusters(emale_cogs) +
  geom_seq() + geom_gene()

# default plot with first genome at y==1
p1 + theme_bw()

image

p2 <- p1
# change y-coords of seqs by hand
p2 <- set_seqs(p2, get_seqs(p2) |> mutate(y=y-1)) 
# relayout other features (genes, etc)
# BUG: this should work but ignore_seqs not propaged
# p2 <- layout(p2, ignore_seqs=TRUE) 
# this works
p2$data <- layout(p2$data, ignore_seqs=TRUE)

# plot with genomes shifted on y axis, i.e. first genomes at y==0
p2 + theme_bw()

image

thackl avatar Oct 15 '22 04:10 thackl