Non-unique labels can cause issues
I'm not sure whether this counts as a 'bug', but it has caused me problems. When a FASTG file contains multiple nodes with the same id, Bandage silently picks one of them to include (the last one read?). This means you can't just 'cat' multiple graph files together without renaming the nodes. One solution would be to warn the user when ids are not unique. Another would be to generate unique internal ids from each node's combination of length and coverage. Thanks
In this example, Bandage only loads the 26933 edge with length 155, but it is linked to 32917 (in the file, that link belongs to the 26933 edge with length 1108):
EDGE_26933_length_1108_cov_2.35066:EDGE_26933_length_1108_cov_2.35066;
EDGE_26933_length_1108_cov_2.35066':EDGE_26933_length_1108_cov_2.35066',EDGE_32917_length_1362_cov_1.4664';
EDGE_32917_length_1362_cov_1.4664:EDGE_26933_length_1108_cov_2.35066;
EDGE_26933_length_155_cov_6.89286:EDGE_26933_length_155_cov_6.89286;
EDGE_26933_length_155_cov_6.89286':EDGE_26933_length_155_cov_6.89286';
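In case it helps anyone hitting the same problem, here is a rough Python sketch for checking a file before loading it. It is not part of Bandage. It assumes SPAdes-style node names of the form EDGE_<id>_length_<len>_cov_<cov> with an optional trailing ' for the reverse strand, and `find_duplicate_ids` is a name I made up for illustration.

```python
import re
from collections import defaultdict

# Assumed name layout: EDGE_<id>_length_<len>_cov_<cov>, optional trailing '
NODE_RE = re.compile(r"EDGE_(\d+)_length_(\d+)_cov_([\d.]+)('?)")


def find_duplicate_ids(header_lines):
    """Return {id: [names]} for ids used by more than one distinct node name.

    Forward and reverse (') headers of the same node count as one name.
    Lines that do not look like a node header are ignored.
    """
    names_by_id = defaultdict(set)
    for line in header_lines:
        line = line.strip().lstrip(">")
        if not line:
            continue
        # The node being defined is the text before the first ':' or ';'.
        own = re.split(r"[:;]", line, maxsplit=1)[0]
        match = NODE_RE.fullmatch(own)
        if match is None:
            continue
        node_id, length, cov, _rc = match.groups()
        names_by_id[node_id].add(f"EDGE_{node_id}_length_{length}_cov_{cov}")
    return {i: sorted(names) for i, names in names_by_id.items() if len(names) > 1}
```

Given the five headers above, this reports id 26933 together with both of its names (the length 1108 and the length 155 edge), and nothing for 32917.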
I agree, the current silent failure is bad! I pushed a quick fix to the development branch (https://github.com/rrwick/Bandage/commit/2db0b0cd52bbfd0b42d921167157c594fac08942); with it, Bandage refuses to load such graphs.
More robust solutions would be nice, however. I'll leave this issue open and hopefully get around to making Bandage at least give a nice informative error message, like 'That file has duplicate node names - I can't load it'. It'd be even better if it could handle duplicate names, but that would be a bit trickier.
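To illustrate what handling duplicate names might involve, here is a small Python sketch (not Bandage's C++ code, and not how Bandage currently works). It keys each node on its full name instead of the id alone, so links can still be resolved when an id is reused. The helper names `node_key` and `parse_links` are hypothetical, and the name layout EDGE_<id>_length_<len>_cov_<cov> is an assumption taken from the example above.

```python
import re

# Assumed name layout: EDGE_<id>_length_<len>_cov_<cov>, optional trailing '
NODE_RE = re.compile(r"EDGE_(\d+)_length_(\d+)_cov_([\d.]+)('?)")


def node_key(name):
    """Turn a full node name into a key: (id, length, coverage, is_reverse).

    Coverage is kept as text to avoid float comparison surprises.
    """
    match = NODE_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"unrecognised node name: {name!r}")
    node_id, length, cov, rc = match.groups()
    return (node_id, int(length), cov, rc == "'")


def parse_links(header_lines):
    """Build {node key: [keys of linked nodes]} from FASTG header lines."""
    graph = {}
    for line in header_lines:
        line = line.strip().lstrip(">").rstrip(";")
        if not line:
            continue
        own, _, targets = line.partition(":")
        graph[node_key(own)] = [node_key(t) for t in targets.split(",") if t]
    return graph
```

With keys like these, the two 26933 edges in the example stay separate, and the link from 32917 resolves to the length 1108 edge. This only works if two nodes never share id, length and coverage all at once, which is not guaranteed.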