libbdsg

How to extract the most likely position of a series of nodes

Open RenzoTale88 opened this issue 4 years ago • 5 comments

Hello, I've imported a pg graph in Python using the bdsg module, and I'm processing a series of alignments to the graph. For practical reasons I'm processing them as GAF alignments generated with vg convert -G. For each read, I have a path in the >47102051>47102052>47102053 format, and I can extract all the possible positions of each node on every path. However, this is impractical when it comes to defining the most likely contiguous set of intervals. Is there a way to extract that directly from the sequence of nodes? For example, if node 47102051 can come from "chr1:0-10" and "chr1:50-70", and node 47102052 comes from "chr1:11-24", then the interval succession is likely to be chr1:0-10 > chr1:11-24. I'm not sure I'm explaining my problem clearly, but I hope it makes sense.
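To make the example concrete, here is a minimal sketch in plain Python of the chaining being asked for. It is illustrative only and not part of libbdsg: the function name `chain_intervals`, the `(path, start, end)` tuple layout, and the `max_gap` parameter are all assumptions. It keeps only the combinations in which each interval starts right after the previous one ends on the same path.

```python
def chain_intervals(candidates_per_node, max_gap=1):
    """Return every chain of (path, start, end) intervals, one per node,
    in which consecutive intervals lie on the same path and the next
    start is at most `max_gap` past the previous end.

    `candidates_per_node` is a list (one entry per node, in read order)
    of lists of candidate intervals. Hypothetical helper, not a bdsg call.
    """
    if not candidates_per_node:
        return []
    # Start one chain per candidate interval of the first node.
    chains = [[cand] for cand in candidates_per_node[0]]
    for candidates in candidates_per_node[1:]:
        extended = []
        for chain in chains:
            path, _, prev_end = chain[-1]
            for cand in candidates:
                cand_path, cand_start, _ = cand
                # Keep the candidate only if it continues the chain.
                if cand_path == path and 0 <= cand_start - prev_end <= max_gap:
                    extended.append(chain + [cand])
        chains = extended
    return chains


candidates = [
    [("chr1", 0, 10), ("chr1", 50, 70)],  # node 47102051
    [("chr1", 11, 24)],                   # node 47102052
]
chains = chain_intervals(candidates)
```

With the intervals from the example, the only surviving chain is chr1:0-10 followed by chr1:11-24. The number of chains can grow quickly for repetitive nodes, so this brute-force version is only meant to show the idea.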

Thank you in advance, Andrea

RenzoTale88 avatar Mar 25 '21 16:03 RenzoTale88

I'm not sure if this is a libbdsg problem per se, but it sounds to me like what you're looking for is an alignment score. vg surject has an algorithm much like this, but I'm not sure if it will be very efficient on graphs with complex topologies.

jeizenga avatar Mar 25 '21 19:03 jeizenga

Thank you for the reply. Yes, I did try surject, but with a graph derived from cactus it kept failing around very large regions of the genome. Alternatively, is there a way to test whether a path visits a given sequence of nodes consecutively?

RenzoTale88 avatar Mar 26 '21 09:03 RenzoTale88

I think you could do what you're describing by using for_each_step_on_handle to get all of the path steps on the node and then using get_next_step or get_previous_step to walk the paths locally and check for the adjacent node.
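A minimal sketch of this approach in Python, assuming `graph` is a bdsg path graph (for example a loaded PackedGraph) that exposes the usual handle graph methods. The helper names `parse_gaf_path` and `paths_with_consecutive_nodes` are hypothetical, and the sketch only checks path traversals in the same orientation as the read; a reverse-strand check would walk with get_previous_step instead.

```python
import re


def parse_gaf_path(path):
    """Split a GAF path such as '>1>2<3' into (node_id, is_reverse) pairs."""
    return [(int(node_id), orient == "<")
            for orient, node_id in re.findall(r"([<>])(\d+)", path)]


def paths_with_consecutive_nodes(graph, gaf_path):
    """Return the names of embedded paths that visit the nodes of
    `gaf_path` consecutively, in the orientation given by the read.
    A path is listed once per matching occurrence.
    """
    nodes = parse_gaf_path(gaf_path)
    if not nodes:
        return []
    first = graph.get_handle(nodes[0][0], nodes[0][1])
    matches = []

    def check_from(step):
        # Steps are reported for either orientation of the node, so
        # skip those that traverse it the other way round.
        if graph.get_is_reverse(graph.get_handle_of_step(step)) != nodes[0][1]:
            return True
        current = step
        for node_id, is_reverse in nodes[1:]:
            if not graph.has_next_step(current):
                return True  # path ends before the read does
            current = graph.get_next_step(current)
            handle = graph.get_handle_of_step(current)
            if (graph.get_id(handle) != node_id
                    or graph.get_is_reverse(handle) != is_reverse):
                return True  # a different node follows on this path
        path_handle = graph.get_path_handle_of_step(step)
        matches.append(graph.get_path_name(path_handle))
        return True  # returning True keeps the iteration going

    graph.for_each_step_on_handle(first, check_from)
    return matches
```

Calling `paths_with_consecutive_nodes(graph, ">47102051>47102052>47102053")` would then list the paths on which the three nodes are adjacent. The cost is proportional to the number of steps on the first node times the read length, so it stays local even on large graphs.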

jeizenga avatar Mar 26 '21 17:03 jeizenga

odgi position provides this interface. Not sure if that helps your case.

ekg avatar Mar 26 '21 19:03 ekg

@jeizenga OK, thanks, I can try to implement that.

@ekg Yes, I think it might help. Do you think it is possible to do this given a graph.og and a list of node IDs as above (>47102051>47102052>47102053)?

Thanks both for your help, I really appreciate it!

RenzoTale88 avatar Mar 27 '21 15:03 RenzoTale88