gggenomes

Enhancements: seq breaks symbols and scale

Open iferres opened this issue 2 years ago • 7 comments

Hi again!

Not reporting a bug, just to suggest a couple of enhancements for future releases.

  1. To be able to add seq break symbols as in this comment (two parallel lines at ~45 degrees at the beginning/end of each break).

  2. To be able to draw a scale but not the whole x axis. This is especially useful when using focus(), since it doesn't make sense to draw an axis for truncated contigs. Instead, a small scale bar for comparing relative sizes, as is usually done in phylogeny figures, would be nice. For example, see ggtree::geom_treescale. It is probably possible using ggplot2 magic, but it would be nice to have an example in the documentation.

Sorry for the spam :P Bests!

iferres avatar Sep 15 '21 20:09 iferres

Definitely good ideas! Challenge accepted ;)

I'm still playing around with some ideas. Would be curious about your thoughts on the following:

gggenomes(emale_genes, emale_seqs) %>% focus(name=="MCP") +
  geom_seq() + geom_gene() + geom_gene_tag(aes(label=name)) +
  geom_break() + # add // at ends of truncated seqs (alternatively: geom_seq(breaks=TRUE))
  geom_scale_bar() + no_x_axis() # add a scale bar

image

thackl avatar Sep 19 '21 19:09 thackl

Looks good!

I'm not sure about geom_break(), since it only makes sense when focus()ing, don't you think? How about focus(add_breaks=TRUE), or something like that? I'm not an expert on ggplot2's grammar, but I can't see what its behaviour would be if it isn't wrapped in a focus() call.

Regarding the scale bar, it also looks very good! Here I link the ggtree approach, which makes use of a custom theme if you want to remove the x axis. Maybe it can serve as inspiration. Using a similar approach would probably help users find what they have seen in other packages. I don't remember gggenes's approach to this, but theme_genes() is probably doing the trick.

Thank you for your interest!

iferres avatar Sep 20 '21 13:09 iferres

You don't necessarily need focus() to truncate sequences. A truncated sequence is defined by having - in addition to a length - a start >1 and/or an end < length. You can also manually set that to illustrate some more complex situations, see the example below.

s0 <- tribble(
   # start/end define regions, i.e. truncated contigs
  ~bin_id, ~seq_id, ~length, ~start, ~end,
  "complete_genome", "chromosome_1_long_trunc_2side", 1e5, 1e4, 2.1e4,
  "fragmented_assembly", "contig_1_trunc_1side", 1.3e4, .9e4, 1.3e4,
  "fragmented_assembly", "contig_2_short_complete", 0.3e4, 1, 0.3e4,
  "fragmented_assembly", "contig_3_trunc_2sides", 2e4, 1e4, 1.4e4
)

l0 <- tribble(
  ~seq_id, ~start, ~end, ~seq_id2, ~start2, ~end2,
  "chromosome_1_long_trunc_2side", 1.1e4, 1.4e4, 
    "contig_1_trunc_1side", 1e4, 1.3e4,
  "chromosome_1_long_trunc_2side", 1.4e4, 1.7e4,
    "contig_2_short_complete", 1, 0.3e4,
  "chromosome_1_long_trunc_2side", 1.7e4, 2e4,
    "contig_3_trunc_2sides", 1e4, 1.3e4
)

gggenomes(seqs=s0, links=l0) +
  geom_seq() + geom_break() + geom_seq_label(nudge_y=-.05) + geom_link()

image

focus() computes start/end for sequences based on some criteria. It also does not plot by itself. It is like mutate() for a tibble. It just adds/modifies start/end columns for sequences in a gggenomes object (and filters unused sequences). That's why focus(add_breaks=TRUE) would not make sense (it always computes breaks, but has nothing to do with plotting them).

geom_break() adds breaks at the ends of truncated sequences. On a plot without any truncated sequences, it would plot nothing. It could, however, be shortened to geom_seq(add_breaks=TRUE) to automatically add breaks to every truncated sequence that is drawn. The drawback of that approach is that it would not be possible to further manipulate the breaks (change the icon, size, color, ...). But it would be faster. So it might make sense to have both options: geom_seq(add_breaks=TRUE) for default breaks and geom_break() for customized breaks.
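To make the rule for truncated sequences concrete, here is a minimal base-R sketch of how the ends that need a break symbol could be detected from a sequence table. The helper find_break_ends() is a hypothetical name for illustration only and is not part of gggenomes; it simply applies the definition given earlier (start > 1 and/or end < length).

```r
# Hypothetical helper, not part of gggenomes: flag which ends of each
# sequence are truncated, i.e. where a break symbol would be drawn.
find_break_ends <- function(seqs) {
  data.frame(
    seq_id      = seqs$seq_id,
    break_left  = seqs$start > 1,          # region starts after the first base
    break_right = seqs$end < seqs$length,  # region ends before the last base
    stringsAsFactors = FALSE
  )
}

# Same regions as in the s0 example above
seqs <- data.frame(
  seq_id = c("chromosome_1_long_trunc_2side", "contig_1_trunc_1side",
             "contig_2_short_complete", "contig_3_trunc_2sides"),
  length = c(1e5, 1.3e4, 0.3e4, 2e4),
  start  = c(1e4, 0.9e4, 1, 1e4),
  end    = c(2.1e4, 1.3e4, 0.3e4, 1.4e4),
  stringsAsFactors = FALSE
)

breaks <- find_break_ends(seqs)
print(breaks)
```

A complete sequence such as contig_2_short_complete gets no flags at all, which is why a break layer would draw nothing on a plot without truncated sequences.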

gggenomes also uses a custom theme (theme_gggenomes). no_x_axis() is just a wrapper around functions to manipulate the theme. It would also be possible to create a theme_gggenomes_no_x_axis(). Alternatively, one can also just use theme_void() to remove everything.

no_x_axis <- function (){
  theme(axis.line.x = element_blank(), axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())
}

Suppressing the axis could be made part of geom_scale_bar(remove_x_axis=TRUE) or so, to automatically suppress the axis if the scale bar is used. However, I feel like removing the axis explicitly makes it more transparent.
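As a rough illustration of the explicit approach, a scale bar can be approximated with plain ggplot2 annotations and combined with the theme tweaks from no_x_axis(). This is only a sketch under assumptions: scale_bar() and its arguments are hypothetical names, not the gggenomes API, and the toy data frame merely stands in for two sequences of different length.

```r
library(ggplot2)

# Same theme tweaks as no_x_axis() above, repeated so the sketch is self-contained
no_x_axis <- function() {
  theme(axis.line.x = element_blank(), axis.title.x = element_blank(),
        axis.text.x = element_blank(), axis.ticks.x = element_blank())
}

# Hypothetical helper: a horizontal bar spanning `width` x-units plus a label
scale_bar <- function(x, y, width, label = paste(width, "bp")) {
  list(
    annotate("segment", x = x, xend = x + width, y = y, yend = y),
    annotate("text", x = x + width / 2, y = y, label = label, vjust = -0.5)
  )
}

# Toy data standing in for two sequences of different length
seqs <- data.frame(y = c(2, 1), x = 1, xend = c(12000, 4000))

p <- ggplot(seqs) +
  geom_segment(aes(x = x, xend = xend, y = y, yend = y)) +
  scale_bar(x = 1, y = 0.5, width = 2000) +  # bar in the lower left corner
  no_x_axis()                                # drop the x axis explicitly
```

Keeping the bar and the axis removal as two separate calls mirrors the point above: the reader of the plotting code can see that the axis was dropped on purpose.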

thackl avatar Sep 21 '21 09:09 thackl

Ah I see, now it makes sense to me to have geom_break(). Thanks again for taking the time to explain it.

Regarding the scale bar, my two cents:

... + 
   theme_gggenomes_scalebar()

?

iferres avatar Sep 21 '21 13:09 iferres

Thank you for taking the time to give feedback! Really appreciated. The theme option sounds good! I'll try to add this to the next release.

thackl avatar Sep 21 '21 18:09 thackl

I assume the following feature request is not trivial at all, but have you considered ... + coord_polar() to draw circularized contigs? Playing with the package (and diving into the source code) I found that the following kinda works:

library(gggenomes) 

s0 <- tibble(
  gene_id = letters[1:6],
  bin_id = c("A", "A", "B", "B", "B", "B"),
  seq_id = factor(c("A1", "A1", "B1", "B1", "B2", "B2"), levels = c("A1", "B2", "B1")), # set factor to order contigs
  feat_id = c("a1","a2","b3", "b4", "b1", "b2"),
  start = c(1, 20, 1, 50, 1, 20),
  end = c(10, 30, 40, 70, 10, 30),
  strand = c(1, 1, 1, 1, 1, 1),
  length = c(1000, 1000, 1000, 1000, 1000, 1000)
)

gggenomes(s0) + 
  geom_seq() + 
  gggenomes:::geom_gene2() + 
  coord_polar() # + 
  # facet_wrap(~bin_id)

but I guess it is experimental and there's still a lot of work to do to make it stable and user friendly, isn't there?

iferres avatar Oct 08 '21 14:10 iferres

I've opened this as a separate issue so I can keep track of it more easily.

thackl avatar Oct 08 '21 20:10 thackl