
Implement Chop

Open susan-garry opened this issue 7 months ago • 0 comments

Chop works! After cargo build --release, try something like fgfa -I ../tests/k.gfa chop -c 3 -l. Here, -c 3 specifies that nodes are to be chopped into segments no longer than 3, and -l specifies that the output should include newly computed links. (At this time, it's still not clear to me what need we have for links, if any, but it would be easy to make computing links the default behavior, or to always compute them.) Side note: slow_odgi does not compute links. Do we care to change this?

The basic algorithm for chop is as follows:

seg_map;     // map from old segments to their new, chopped counterparts
for each segment:
    chop into segments of size c or smaller
    if args.l:
        link the new segments together, from head to tail (i.e., in the forward orientation)
    update seg_map

for each path:
    new_path;
    for each step in path:
        for new_seg in seg_map(step.seg):
            append new_seg to our new_path
    add new_path to new_fgfa

if args.l:
    // A.begin / A.end: the first / last new segment chopped from A
    for link (A -> B) in old_fgfa:
        add a new link from
            (A.forward ? (A.end, forward) : (A.begin, backwards))
                -> (B.forward ? (B.begin, forward) : (B.end, backwards))
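
For reference, the pseudocode above could be sketched in Rust roughly as follows. This is only an illustration, not the code in cmd.rs: the Graph and Handle types and the chop signature are hypothetical stand-ins for the real FlatGFA structures, sequences are assumed to be ASCII, and the sketch assumes that a backward step should visit the chopped pieces in reverse order, which the pseudocode does not spell out.

```rust
// Hypothetical, simplified types; the real FlatGFA structures differ.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Handle {
    pub seg: usize,    // index into Graph::segs
    pub forward: bool, // orientation of the traversal
}

#[derive(Debug, Default, PartialEq)]
pub struct Graph {
    pub segs: Vec<String>,            // segment sequences (assumed ASCII)
    pub paths: Vec<Vec<Handle>>,      // each path is a list of oriented steps
    pub links: Vec<(Handle, Handle)>, // (from, to)
}

/// Chop every segment into pieces of at most `c` bases.
pub fn chop(old: &Graph, c: usize, compute_links: bool) -> Graph {
    assert!(c > 0, "chop size must be positive");
    let mut new = Graph::default();
    // seg_map[i] is the range of new segment indices replacing old segment i.
    let mut seg_map: Vec<std::ops::Range<usize>> = Vec::with_capacity(old.segs.len());

    for seq in &old.segs {
        let start = new.segs.len();
        for piece in seq.as_bytes().chunks(c) {
            new.segs.push(String::from_utf8_lossy(piece).into_owned());
        }
        if seq.is_empty() {
            // Keep at least one piece so every old segment has a counterpart.
            new.segs.push(String::new());
        }
        let end = new.segs.len();
        if compute_links {
            // Link consecutive pieces head to tail, in the forward orientation.
            for i in (start + 1)..end {
                new.links.push((
                    Handle { seg: i - 1, forward: true },
                    Handle { seg: i, forward: true },
                ));
            }
        }
        seg_map.push(start..end);
    }

    for path in &old.paths {
        let mut new_path = Vec::new();
        for step in path {
            let pieces = seg_map[step.seg].clone();
            if step.forward {
                new_path.extend(pieces.map(|seg| Handle { seg, forward: true }));
            } else {
                // Assumption: a backward step walks the pieces in reverse order.
                new_path.extend(pieces.rev().map(|seg| Handle { seg, forward: false }));
            }
        }
        new.paths.push(new_path);
    }

    if compute_links {
        for (a, b) in &old.links {
            let (ra, rb) = (&seg_map[a.seg], &seg_map[b.seg]);
            // Leave A through its last piece if forward, its first if backward.
            let from = if a.forward { ra.end - 1 } else { ra.start };
            // Enter B through its first piece if forward, its last if backward.
            let to = if b.forward { rb.start } else { rb.end - 1 };
            new.links.push((
                Handle { seg: from, forward: a.forward },
                Handle { seg: to, forward: b.forward },
            ));
        }
    }
    new
}
```

Storing a range per old segment works here because the pieces of each segment are appended contiguously, so seg_map does not need to hold a separate list per segment.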

One weird note here: the implementation of chop is split between cmd.rs and main.rs. The bulk of the work is done in cmd.rs, but the logic for which aspects of the original graph to preserve lives in main.rs. It's not clear that a nice fix exists: because the new graph borrows elements from a GFAStore created by chop in cmd.rs, ownership of that GFAStore must be passed back to the main function for the new FlatGFA to be valid. The best fix may be to compute the FlatGFA in chop and return both the FlatGFA and the GFAStore, but right now we do not.
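
To make the ownership constraint concrete, here is a minimal sketch of one possible arrangement. Store and View are hypothetical stand-ins for GFAStore and FlatGFA, with made-up fields, and chop_store and view are illustrative names, not functions from the codebase. The idea is that the caller keeps ownership of the store, while the decision about what to preserve from the original graph sits in a helper next to chop instead of in main.rs.

```rust
// Hypothetical stand-ins for GFAStore and FlatGFA; the fields are illustrative only.
pub struct Store {
    pub segs: Vec<String>, // owned data produced by chop
}

pub struct View<'a> {
    pub header: &'a str,    // preserved from the original graph
    pub segs: &'a [String], // borrowed from whichever store owns the segments
}

/// Stand-in for the chop command: builds an owned store of chopped segments.
/// Panics if `c` is 0, because `chunks` requires a non-zero chunk size.
pub fn chop_store(old: &View<'_>, c: usize) -> Store {
    let mut segs = Vec::new();
    for seq in old.segs {
        for piece in seq.as_bytes().chunks(c) {
            segs.push(String::from_utf8_lossy(piece).into_owned());
        }
    }
    Store { segs }
}

impl Store {
    /// Assemble the new view: the header comes from the old graph, the
    /// segments from this store. The result cannot outlive either of them.
    pub fn view<'a>(&'a self, old: &View<'a>) -> View<'a> {
        View {
            header: old.header,
            segs: &self.segs,
        }
    }
}
```

In this arrangement the caller would write something like let store = chop_store(&old, 3); let new = store.view(&old); so the store is still owned by the calling function, as it is today, but the caller no longer needs to know which fields come from which source.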

susan-garry, Jul 08 '24 20:07