pollen
Implement Chop
Chop works! After `cargo build --release`, try something like `fgfa -I ../tests/k.gfa chop -c 3 -l`. `-c 3` specifies that nodes are to be chopped into segments no longer than 3, and `-l` specifies that the output should include newly computed links. (At this time, it's still not clear to me what need we have for links, if any, but it would be easy to make computing links the default behavior or to always compute links.) Side note: `slow_odgi` does not compute links; do we care to change this?
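
To make the `-c` semantics concrete, here is a minimal sketch of chopping a single sequence. `chop_seq` is a hypothetical helper written only for illustration (it is not taken from the fgfa source), and it assumes segment sequences are plain ASCII.

```rust
/// Split `seq` into consecutive pieces of at most `max_len` bytes.
/// Hypothetical helper for illustration only; not part of fgfa.
fn chop_seq(seq: &str, max_len: usize) -> Vec<&str> {
    assert!(max_len > 0, "max_len must be positive");
    seq.as_bytes()
        .chunks(max_len)
        // Assumption: sequences are ASCII, so every chunk is valid UTF-8.
        .map(|chunk| std::str::from_utf8(chunk).expect("sequence must be ASCII"))
        .collect()
}
```

With `-c 3`, a 7-base segment such as `ACGTACG` would come out as three pieces: `ACG`, `TAC`, and `G`. A segment that is already 3 bases or shorter is left as a single piece.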
The basic algorithm for `chop` is as follows:

    seg_map; // map from old segments to their new, chopped counterparts
    for each segment:
        chop into segments of size c or smaller
        if args.l:
            link the new segments together, from head to tail (i.e., in the forward orientation)
        update seg_map
    for each path:
        new_path;
        for each step in path:
            for new_seg in seg_map(step.seg):
                append new_seg to our new_path
        add new_path to new_fgfa
    if args.l:
        for link (A -> B) in old_fgfa:
            add a new link from
                (A.forward ? (A.end, forward) : (A.begin, backwards))
                -> (B.forward ? (B.begin, forward) : (B.end, backwards))

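
For anyone who wants to poke at the algorithm outside the real codebase, here is a rough, self-contained sketch of the same steps on a toy in-memory graph. Everything in it (`Graph`, `Handle`, `Link`, the `chop` signature) is a stand-in I made up for illustration, not the actual `FlatGFA` types. It also makes two assumptions the pseudocode leaves open: segments are non-empty, and a backward step visits the chopped pieces in reverse order.

```rust
use std::ops::Range;

// Toy stand-ins for illustration; these are NOT the real FlatGFA types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Handle { seg: usize, forward: bool }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Link { from: Handle, to: Handle }

#[derive(Debug, Default)]
struct Graph {
    segs: Vec<String>,
    paths: Vec<Vec<Handle>>,
    links: Vec<Link>,
}

fn chop(old: &Graph, c: usize, compute_links: bool) -> Graph {
    assert!(c > 0, "c must be positive");
    let mut new = Graph::default();
    // seg_map[i] is the range of new segment ids that replace old segment i.
    let mut seg_map: Vec<Range<usize>> = Vec::with_capacity(old.segs.len());

    for seq in &old.segs {
        // Assumption: every segment has at least one base.
        assert!(!seq.is_empty(), "empty segments are not handled in this sketch");
        let start = new.segs.len();
        for piece in seq.as_bytes().chunks(c) {
            new.segs.push(String::from_utf8_lossy(piece).into_owned());
        }
        let end = new.segs.len();
        if compute_links {
            // Link the pieces head to tail, in the forward orientation.
            for id in (start + 1)..end {
                new.links.push(Link {
                    from: Handle { seg: id - 1, forward: true },
                    to: Handle { seg: id, forward: true },
                });
            }
        }
        seg_map.push(start..end);
    }

    for path in &old.paths {
        let mut new_path = Vec::new();
        for step in path {
            let range = seg_map[step.seg].clone();
            if step.forward {
                new_path.extend(range.map(|seg| Handle { seg, forward: true }));
            } else {
                // Assumption: a backward step walks the pieces last to first.
                new_path.extend(range.rev().map(|seg| Handle { seg, forward: false }));
            }
        }
        new.paths.push(new_path);
    }

    if compute_links {
        for link in &old.links {
            let a = &seg_map[link.from.seg];
            let b = &seg_map[link.to.seg];
            // Leave A from its last piece if forward, its first piece if backward.
            let from = if link.from.forward {
                Handle { seg: a.end - 1, forward: true }
            } else {
                Handle { seg: a.start, forward: false }
            };
            // Enter B at its first piece if forward, its last piece if backward.
            let to = if link.to.forward {
                Handle { seg: b.start, forward: true }
            } else {
                Handle { seg: b.end - 1, forward: false }
            };
            new.links.push(Link { from, to });
        }
    }
    new
}
```

Storing a contiguous id range per old segment keeps `seg_map` cheap, since the pieces of one segment are always allocated consecutively. The real implementation may well do this differently.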
One weird note here: the implementation of `chop` is split between `cmd.rs` and `main.rs`. The bulk of the work is done in `cmd.rs`, but the logic for which aspects of our original graph to preserve is in `main.rs`. It's unclear whether a clean fix exists: because our new graph borrows elements from a `GFAStore` created by `chop` in `cmd.rs`, ownership of the `GFAStore` must be passed to the `main` function in order for our new `FlatGFA` to be valid. The best fix may be to compute the `FlatGFA` in `chop` and return both the `FlatGFA` and the `GFAStore`, but right now we do not.
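
To illustrate the ownership constraint, here is a stripped-down sketch of the current shape. `Store` and `View` are hypothetical stand-ins playing the roles of `GFAStore` and `FlatGFA`; the real types and their methods differ, so read this only as a model of the borrowing pattern.

```rust
/// Hypothetical owning store (plays the role of GFAStore in this sketch).
struct Store {
    segs: Vec<String>,
}

/// Hypothetical borrowed view (plays the role of FlatGFA in this sketch).
struct View<'a> {
    segs: &'a [String],
}

impl Store {
    /// Borrow the store's data; the view cannot outlive `self`.
    fn view(&self) -> View<'_> {
        View { segs: &self.segs }
    }
}

/// The `cmd.rs` side: do the chopping and hand back ownership of the new store.
fn chop_store(input: &View<'_>, c: usize) -> Store {
    assert!(c > 0, "c must be positive");
    let mut segs = Vec::new();
    for seq in input.segs {
        for piece in seq.as_bytes().chunks(c) {
            segs.push(String::from_utf8_lossy(piece).into_owned());
        }
    }
    Store { segs }
}

/// The `main.rs` side: the caller owns the store, then builds the view from it.
fn run(input: &View<'_>, c: usize) -> usize {
    let store = chop_store(input, c); // ownership moves to the caller
    let view = store.view(); // valid only while `store` is alive
    view.segs.len()
}
```

In this simplified model, the view has to be built by whoever ends up owning the store, which is why part of the logic lands on the caller's side. Returning the store together with a view that borrows from it would, at least in this model, run into the usual difficulty of returning a value alongside a reference into it, so that option may need more thought.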