pollen
pollen copied to clipboard
Proofs about our commands
What if we could prove that Pollen DSL programs preserve the semantics of variation graphs? That is, some but not all odgi commands are "logical no-ops," in the sense that they may split up segments and rename things and stuff, but the set of nucleotide sequences they represent (i.e.,
odgi flatten
would still produce the same flat sequences). Could we somehow prove that these programs are no-ops, so they're free of "bugs"?
Originally posted by @sampsyo in https://github.com/cucapra/pollen/pull/84#pullrequestreview-1417108141
I love this idea!
I'd love to be able to state and prove, for example:
forall g,
valid g ->
exists g', flip g g' /\ valid g' /\ meaning g == meaning g'
where
-
flip a b
is a relation, in this case a function, that says thatb
is the flipped version ofa
. i.e, all the good stuff stated here. -
meaning x
is some beautiful encapsulation of the semantics of a graphx
, rising above all the kludge of segment-splits and names.
Bit of progress on this, still in slow_odgi
for now.
I have done up paths_logically_le
. I take a path-forward approach and assert, in the relevant commands (just chop
and inject
for now) that the graph we pass is "less than or equal to" the graph we get back. That is, every path that I used to have in the input graph still exists, with the same name and sequence, in the new graph.
This issue talks about logical equality, but I've come up with this logical <=
approach because I think there's some value there. We can state ==
using <=
and antisymmetry.