Connected components?

Open tnn111 opened this issue 3 years ago • 10 comments

Hi,

Is there a way of getting BandageNG to output connected components as separate assembly graphs? I work with metagenomes and it would be extremely useful to be able to do this and then work on the individual graphs.

Thanks, Torben

tnn111 avatar Aug 25 '22 04:08 tnn111

Are you sure you have separate components? That is very unusual. Metagenomes are usually one huge component, connected via conserved elements such as ribosomal genes and others that have undergone HGT.

Can you post the graph information here?

asl avatar Aug 25 '22 07:08 asl

Hi Anton,

I’m working with large environmental metagenomes, and while they do contain some hairballs, they are mostly a collection of components. The graph alone is around 500 MB. I’m attaching a screenshot from BandageNG. What I’d like to be able to do is extract each of the connected components so that I can work on them individually.

Thanks, Torben

tnn111 avatar Aug 25 '22 17:08 tnn111

Sorry, no screenshot. You'd likely need to attach it via the GitHub website, not email.

asl avatar Aug 25 '22 17:08 asl

Trying again :-)

[Screenshot: PastedGraphic-1]

tnn111 avatar Aug 25 '22 17:08 tnn111

Just did. Thanks!

tnn111 avatar Aug 25 '22 17:08 tnn111

Oh, ok. Looks like a very fragmented graph with low complexity. For now I'd just select a node from the component and draw the graph within, say, distance 500 of that node.

asl avatar Aug 25 '22 20:08 asl

@tnn111 Would it work if we added another possible scope, namely "component containing node(s)"?

asl avatar Sep 06 '22 07:09 asl

Hi Anton,

Yes, that would work. The reason for it is to be able to focus on what’s commonly a much smaller subgraph.

I needed the functionality, so I wrote a trivial piece of code that parses a GFA file and extracts the subgraph containing a given segment or a given path. It works well enough, and I’m experimenting with adding more functionality as I go.
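As an illustration of this kind of extraction (a minimal sketch, not the code referred to above), the connected component containing a given segment can be found with a breadth-first search over the `L` records. The function name `component_containing` is hypothetical, and the sketch assumes tab-separated GFA1 with only `H`, `S`, `L`, and `P` records and comma-separated path steps ending in `+` or `-`.

```python
from collections import defaultdict, deque


def component_containing(gfa_lines, seed):
    """Return the GFA lines of the connected component containing `seed`.

    `gfa_lines` is a list of GFA1 lines (assumed tab-separated).
    Link orientation is ignored: two segments are connected if any
    L record joins them.
    """
    adjacency = defaultdict(set)
    segments = set()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            segments.add(fields[1])
        elif fields[0] == "L":
            adjacency[fields[1]].add(fields[3])
            adjacency[fields[3]].add(fields[1])

    if seed not in segments:
        raise KeyError(f"segment {seed!r} not found")

    # Breadth-first search from the seed segment.
    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    kept = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        tag = fields[0]
        if tag == "H":
            kept.append(line)
        elif tag in ("S", "L") and fields[1] in seen:
            kept.append(line)
        elif tag == "P":
            # Path steps look like "name+" or "name-"; test the first step.
            first_step = fields[2].split(",")[0]
            if first_step[:-1] in seen:
                kept.append(line)
    return kept
```

Extracting by path instead of by segment would amount to seeding the search with the first segment of that path.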

Thanks, Torben

tnn111 avatar Sep 07 '22 04:09 tnn111

Well, SPAdes has a gfa-split tool in the package that splits a GFA into connected components, preserving PATHS, etc. :)

Also, you can certainly reuse the GFA parser from BandageNG; it's pretty self-contained (and was made that way for easy reuse).
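For readers who want to see the idea of splitting a GFA into connected components while keeping path records, here is a rough sketch using union-find. It is not how the SPAdes tool or the BandageNG parser is implemented; `split_components` is a hypothetical name, and the sketch assumes tab-separated GFA1 with `H`, `S`, `L`, and `P` records only, where each path stays within one component.

```python
def split_components(gfa_lines):
    """Split GFA1 lines into one list of lines per connected component.

    Header (H) lines are copied into every component; S, L, and P lines
    go to the component of the first segment they mention.
    """
    parent = {}

    def find(name):
        # Union-find lookup with path halving.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    records = [line.rstrip("\n").split("\t") for line in gfa_lines]

    # First pass: register segments and merge the ends of every link.
    for fields in records:
        if fields[0] == "S":
            find(fields[1])
        elif fields[0] == "L":
            union(fields[1], fields[3])

    headers = ["\t".join(f) for f in records if f[0] == "H"]

    # Second pass: route each record to its component.
    components = {}
    for fields in records:
        if fields[0] in ("S", "L"):
            key = fields[1]
        elif fields[0] == "P":
            # Path steps look like "name+" or "name-"; use the first step.
            key = fields[2].split(",")[0][:-1]
        else:
            continue
        components.setdefault(find(key), list(headers)).append("\t".join(fields))
    return list(components.values())
```

Each returned list can then be written to its own file, one component per GFA.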

asl avatar Sep 07 '22 08:09 asl

If you're willing to build it, the GFAse repository has the same functionality in the build/split_connected_components executable: https://github.com/rlorigro/GFAse/blob/main/src/executable/split_connected_components.cpp

The GetBlunted repository also has a subgraph extraction executable, but it will not preserve a path in the output: https://github.com/vgteam/GetBlunted/blob/master/src/executable/extract_subgraph.cpp

rlorigro avatar Sep 13 '22 17:09 rlorigro