gggenes
Implement SBOL sequence feature glyphs
Implement the Synthetic Biology Open Language (SBOL) sequence feature glyphs. The current rough plan for this is:
- Add a `geom_<glyph>()` function for each sequence feature glyph
- `geom_gene_arrow()` becomes a wrapper for `geom_CDS()` and is soft-deprecated
- `geom_subgene_arrow()` becomes a wrapper for `geom_polypeptide_region()` and is soft-deprecated
- `geom_feature()` becomes a convenience geom that wraps all the glyph geoms and accepts a `type` aesthetic. This allows a user to draw different types of sequence feature with a single layer, rather than having to add a new geom layer for each type, which could get tedious. This function will lack the flexibility of having different aesthetic mappings for different glyphs, as well as fine-tuned control of glyph geometry, but it will still probably cover a large proportion of use cases. To maintain backward compatibility, if the `type` aesthetic is not mapped, it should draw promoters or locations with the current `geom_feature()` interface, but this functionality will be soft-deprecated
- Continue the pattern of using the `xmin`, `xmax` and `forward` aesthetics to control the direction of directional glyphs
- Continue the pattern of `geom_gene_arrow()`/`geom_gene_label()` by having a separate `geom_<glyph>_label()` for each glyph (`geom_CDS_label()`, `geom_intron_label()` etc.). This is necessary to preserve the ggplot2 grammar, as a user might want to use different aesthetic mappings for glyphs and labels. `geom_gene_label()`, `geom_feature_label()`, and `geom_subgene_label()` would become wrappers and be soft-deprecated
- Each of the `geom_<glyph>()` layers will also accept a `label` aesthetic which draws a text label for the feature with sensible defaults
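To make the plan above a bit more concrete, here is a minimal base-R sketch of three of the ideas: a soft-deprecated wrapper, the `xmin`/`xmax`/`forward` direction rule, and dispatching rows to glyph drawers by `type`. Every name ending in `_sketch`, plus `glyph_points_right()` and `dispatch_by_type()`, is hypothetical and exists only for illustration. None of this is code from the branch, and the stand-ins return plain lists rather than real ggplot2 layers so that the logic can run without any packages.

```r
# Hypothetical stand-in for a glyph geom. A real geom would return a ggplot2
# layer; this one returns a plain list so the sketch runs in base R.
geom_CDS_sketch <- function(...) {
  list(glyph = "CDS", args = list(...))
}

# Soft deprecation: the old name keeps working and forwards every argument
# unchanged, but tells the user which function is now preferred.
geom_gene_arrow_sketch <- function(...) {
  message("geom_gene_arrow() is soft-deprecated; consider geom_CDS() instead.")
  geom_CDS_sketch(...)
}

# Direction rule (assumed): the order of xmin and xmax sets the default
# direction, and forward = FALSE flips it.
glyph_points_right <- function(xmin, xmax, forward = TRUE) {
  (xmax >= xmin) == as.logical(forward)
}

# One layer, many glyphs: split the rows by `type` and pass each group to the
# matching drawer. If `type` is absent, or a type has no drawer, fall back to
# the legacy drawing function.
dispatch_by_type <- function(data, drawers, fallback) {
  if (!"type" %in% names(data)) {
    return(list(legacy = fallback(data)))
  }
  groups <- split(data, data$type)
  Map(function(type, rows) {
    drawer <- drawers[[type]]
    if (is.null(drawer)) fallback(rows) else drawer(rows)
  }, names(groups), groups)
}
```

The fallback branch in `dispatch_by_type()` is where the backward-compatible behaviour of the current `geom_feature()` interface would live under this sketch.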
I've opened an `SBOL_glyphs` branch to work on this, and added `geom_aptamer()` and `geom_aptamer_label()` as a starting point:
library(tidyverse)
library(gggenes)
aptamers <- data.frame(
  molecule = c("Genome1", "Genome1", "Genome1", "Genome2", "Genome2", "Genome2"),
  location = c(50, 71, 13, 8, 12, 91),
  name = paste0("Apt", 1:6)
)
ggplot(aptamers, aes(x = location, y = molecule, label = name)) +
  geom_aptamer(inherit.aes = TRUE, height = grid::unit(10, "mm")) +
  geom_aptamer_label(inherit.aes = TRUE, height = grid::unit(10, "mm"))
Created on 2023-07-11 with reprex v2.0.2
To get the coordinates for the aptamer glyph, I downloaded the SVG files for the glyphs from the latest SBOL release then converted them to grid-compatible coordinates with the svgparser package:
library(svgparser)
aptamer <- read_svg("~/Downloads/glyphs-svg/aptamer.svg", obj_type = "data.frame")
This has the pleasing benefit that paths expressed as Bézier curves in the SVG are automatically converted into a series of short line segments, which sidesteps the trouble of transforming Béziers into polar coordinates. I think this method of extracting the glyph coordinates from the SVG assets should be the rule, though no doubt there will be some exceptions where this is not the best choice.
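For anyone unfamiliar with what flattening a Bézier into line segments involves, here is a minimal base-R sketch of the general idea. It is not necessarily how svgparser does it internally, and `flatten_cubic_bezier()` is a hypothetical helper written only for this illustration. It evaluates a cubic Bézier at `n` evenly spaced parameter values using the Bernstein polynomials and returns the resulting vertices.

```r
# Approximate a cubic Bezier curve by a polyline.
# p0 and p3 are the end points, p1 and p2 the control points, each c(x, y).
# n is the number of vertices in the polyline (so n - 1 line segments).
flatten_cubic_bezier <- function(p0, p1, p2, p3, n = 20) {
  t <- seq(0, 1, length.out = n)
  # Cubic Bernstein basis: one row per parameter value, one column per point
  basis <- cbind((1 - t)^3, 3 * (1 - t)^2 * t, 3 * (1 - t) * t^2, t^3)
  control <- rbind(p0, p1, p2, p3)   # 4 x 2 matrix of points
  points <- basis %*% control        # n x 2 matrix of curve vertices
  data.frame(x = points[, 1], y = points[, 2])
}

# Example: an arch from (0, 0) to (1, 0)
arch <- flatten_cubic_bezier(c(0, 0), c(0, 1), c(1, 1), c(1, 0), n = 21)
```

Once the curve is a plain set of vertices, each vertex can be transformed on its own, which is why the polyline form is easier to carry into other coordinate systems than the original curve.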