gggenomes
Successive ribbon and track colors
Hi, thanks for your amazing package. I am trying to describe an inversion scenario between 2 genomes. Here is my code:
library(tidyverse) # tibble(), bind_rows(), str_replace(), %>%
library(gggenomes)
# a minimal seq track
s0 <- tibble(
seq_id = c("M", "S1", "P"),
length = c(14011714, 16938810, 16938810)
)
# a minimal gene track
g0 <- tibble(
seq_id = c("M", "M", "M", "M", "M", "M",
"S1", "S1", "S1", "S1", "S1", "S1",
"P", "P", "P", "P", "P", "P"),
start = c(1, 1957564, 7345956, 7944221, 10852458, 12757211,
1, 1957564, 10253430, 10851695, 10852458, 12757211,
1, 7612881, 12493219, 15053107, 15072904, 15671252),
end = c(1830774, 6805033, 7898617, 10851694, 12679667, 14011714,
1830774, 6805033, 7345956, 10299033, 12679667, 14011714,
1832933, 1879467, 7830998, 12534142, 15670067, 16938810)
)
# a simple link track
l0 <- tibble(
seq_id = c("M", "M", "M", "M", "M"),
start = c(1, 1957564, 7345956, 10852458, 12757211),
end = c(1830774, 6805033, 10851694, 12679667, 14011714),
seq_id2 = c("S1", "S1", "S1", "S1", "S1"),
start2 = c(1, 1957564, 10851695, 10852458, 12757211),
end2 = c(1830774, 6805033, 7345956, 12679667, 14011714),
)
p <- gggenomes(genes=g0, seqs=s0, links=l0)
p +
geom_seq() + # draw contig/chromosome lines
geom_seq_label() + # label each sequence
geom_gene() + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
Sorry for the trivial questions, but I am trying to understand how to:
- Add a distinct color for each track and the corresponding ribbon.
- The "M" is overlapping with a ribbon, how can I put it on the side instead of under the 1st track?
- How can I put ribbons between S1 and P?
- Put a legend for each track (and the corresponding colors).
Thanks for your help!
The "M" is overlapping with a ribbon, how can I put it on the side instead of under the 1st track?
Either use geom_bin_label() instead of geom_seq_label()
p +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene() + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
or keep geom_seq_label() and add geom_link(offset=c(0.3,0.15)) to make space
p +
geom_seq() + # draw contig/chromosome lines
geom_seq_label() + # label each sequence
geom_gene() + # draw genes as arrow
geom_link(offset=c(0.3,0.15)) # draw some connections between syntenic regions
How can I put ribbons between S1 and P?
Just add rows with S1/P as seq_id and seq_id2, plus the matching start/end coordinates (I did not adjust the numbers)
l1 <- bind_rows(l0, tibble(
seq_id = c("P", "P", "P", "P", "P"),
start = c(1, 1957564, 7345956, 10852458, 12757211),
end = c(1830774, 6805033, 10851694, 12679667, 14011714),
seq_id2 = c("S1", "S1", "S1", "S1", "S1"),
start2 = c(1, 1957564, 10851695, 10852458, 12757211),
end2 = c(1830774, 6805033, 7345956, 12679667, 14011714),
))
gggenomes(genes=g0, seqs=s0, links=l1) +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene() + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
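Since the numbers above were not adjusted, here is a minimal sketch of one way to derive S1/P link coordinates from the gene table instead of typing them by hand. The `pairs` table is only an illustration (it reuses the S1_1/P_1 and S1_4/P_6 pairing from the cluster example further down this thread), the `feat_id` numbering per genome in row order is an assumption of this sketch, and `links_from_pairs()` is a hypothetical helper, not part of gggenomes.

```r
library(dplyr)
library(tibble)

# Coordinates copied from g0 for four of the S1 and P genes; feat_id numbers
# each gene per genome in row order (an assumption made for this sketch)
genes <- tibble(
  feat_id = c("S1_1", "S1_4", "P_1", "P_6"),
  seq_id  = c("S1", "S1", "P", "P"),
  start   = c(1, 10851695, 1, 15671252),
  end     = c(1830774, 10299033, 1832933, 16938810)
)

# Illustrative pairing only: replace with your real homologous regions
pairs <- tribble(
  ~feat_id, ~feat_id2,
  "S1_1",   "P_1",
  "S1_4",   "P_6"
)

# Hypothetical helper: look up both ends of each pair in the gene table
links_from_pairs <- function(pairs, genes) {
  genes2 <- rename(genes, feat_id2 = feat_id, seq_id2 = seq_id,
                   start2 = start, end2 = end)
  pairs %>%
    inner_join(genes, by = "feat_id") %>%
    inner_join(genes2, by = "feat_id2") %>%
    select(seq_id, start, end, seq_id2, start2, end2)
}

l_s1_p <- links_from_pairs(pairs, genes)
print(l_s1_p)
```

The result has the same columns as l0, so something like bind_rows(l0, l_s1_p) could be passed as links= in place of l1.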
Or, if you just want to connect genes, you could do:
g1 <- g0 %>% group_by(seq_id) %>%
mutate(feat_id = paste(seq_id, row_number(), sep = "_")) %>%
ungroup()
c0 <- g1 %>% transmute(cluster_id = str_replace(feat_id, ".*_", "cls"), feat_id)
p2 <- gggenomes(genes=g1, seqs=s0) %>% add_clusters(c0)
p2 +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene() + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
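To see what the feat_id/cluster_id step above produces, here is a small sketch on a two-genome subset of g0, using only dplyr and stringr. The variable names `genes`, `g` and `cl` are made up for this example. Note that this scheme puts the n-th gene of every genome into the same cluster, so it is probably only appropriate when gene order is conserved; with an inversion, an explicit cluster table is likely the safer choice.

```r
library(dplyr)
library(stringr)
library(tibble)

# First two genes of M and S1, copied from g0
genes <- tibble(
  seq_id = c("M", "M", "S1", "S1"),
  start  = c(1, 1957564, 1, 1957564),
  end    = c(1830774, 6805033, 1830774, 6805033)
)

# Number genes within each genome: M_1, M_2, S1_1, S1_2
g <- genes %>%
  group_by(seq_id) %>%
  mutate(feat_id = paste(seq_id, row_number(), sep = "_")) %>%
  ungroup()

# Replace everything up to the last "_" with "cls", so M_1 and S1_1
# both end up in cluster "cls1"
cl <- g %>%
  transmute(cluster_id = str_replace(feat_id, ".*_", "cls"), feat_id)

print(cl)
```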
Add a distinct color for each track and the corresponding ribbon. Put a legend for each track (and the corresponding colors)
Not 100% sure what you mean. In gggenomes, "track" refers to a type of data: all "genes" are one track, all "links" are another track. Coloring by track could be done as below.
p2 +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene(aes(fill="gene")) + # draw genes as arrow
geom_link(aes(fill="link")) # draw some connections between syntenic regions
But I have the feeling you want something else. Maybe color by genome? If not, could you elaborate?
p2 +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene(aes(fill=seq_id)) + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
Thanks @thackl it's very helpful.
- By color, I mean one color for each "gene" and each ribbon, so in my example it would be 6 different colors with an associated legend. I have already pre-defined colors, so if it's possible to set HTML color codes that would be great.
- I was also wondering if the genome size axis at the bottom could be more precise? Instead of having 0M 5M 10M 15M, I would have a tick every 2M.
# the easiest is probably to put genes in clusters
c1 <- tribble(
~cluster_id, ~feat_id,
"cls1", "M_1",
"cls1", "S1_1",
"cls1", "P_1",
"cls2", "M_2",
"cls2", "S1_4",
"cls2", "P_6"
# and so on
)
p3 <- gggenomes(genes=g1, seqs=s0) %>%
add_clusters(c1) +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene(aes(fill=cluster_id)) + # draw genes as arrow
geom_link() # draw some connections between syntenic regions
p3
# cluster_id is also appended to links
# use scale_*_manual for custom coloring
p3 +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene(aes(fill=cluster_id)) + # draw genes as arrow
geom_link(aes(fill=cluster_id, color=cluster_id)) + # draw some connections between syntenic regions
scale_fill_manual(values=c(cls1="#aaaaff", cls2="#ffaaaa"), na.value="grey70") +
scale_color_manual(values=c(cls1="#aaaaff", cls2="#ffaaaa"), na.value="grey70")
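With six clusters it may be easier to define the colors once as a named vector and reuse it for both scales. This is only a sketch: the name `cluster_cols` is made up, and only the first two hex codes come from the example above, the other four are placeholders to swap for your pre-defined colors.

```r
# Placeholder hex codes: cls1 and cls2 are the values used above, the rest are
# made up for illustration and should be replaced with your own colors
cluster_cols <- setNames(
  c("#aaaaff", "#ffaaaa", "#aaffaa", "#ffddaa", "#ddaaff", "#aadddd"),
  paste0("cls", 1:6)
)

# Basic sanity check that every entry is a 6-digit hex color
stopifnot(all(grepl("^#[0-9a-fA-F]{6}$", cluster_cols)))

print(cluster_cols)
```

The vector can then be passed as scale_fill_manual(values=cluster_cols, na.value="grey70") and scale_color_manual(values=cluster_cols, na.value="grey70"), so genes and ribbons of the same cluster share one legend entry.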
# the axis labels can be controlled via scale_x_bp
# Have a look at scale_x_continuous() for more options.
p3 +
geom_seq() + # draw contig/chromosome lines
geom_bin_label() + # label each sequence
geom_gene(aes(fill=cluster_id)) + # draw genes as arrow
geom_link(aes(fill=cluster_id, color=cluster_id)) + # draw some connections between syntenic regions
scale_fill_manual(values=c(cls1="#aaaaff", cls2="#ffaaaa"), na.value="grey70") +
scale_color_manual(values=c(cls1="#aaaaff", cls2="#ffaaaa"), na.value="grey70") +
scale_x_bp(breaks=seq(0, 18, 2)*1e6)
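Instead of hard-coding the upper limit of 18, the breaks could also be computed from the sequence lengths. A minimal sketch in base R, where `seq_lengths`, `step` and `axis_breaks` are names chosen for this example:

```r
# Sequence lengths copied from s0
seq_lengths <- c(M = 14011714, S1 = 16938810, P = 16938810)

step <- 2e6  # one tick every 2 Mb

# Breaks from 0 up to the last multiple of step within the longest sequence
axis_breaks <- seq(0, max(seq_lengths), by = step)

print(axis_breaks / 1e6)
```

The result can be used as scale_x_bp(breaks=axis_breaks), and changing `step` gives a different tick spacing.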