vg
vg convert to GFA and back drops path data.
1. What were you trying to do? Convert an xg graph to gfa, make a small edit, and then convert to pg.
2. What did you want to happen? I wanted a pg graph with all the paths included.
3. What actually happened? Whenever several paths shared a common root but had different start positions, only one of them was kept. Example of paths (from vg paths -Lx) with the same root:
hg002#2#JAHKSD010000042.1#0
hg002#2#JAHKSD010000042.1#29382422
hg002#2#JAHKSD010000042.1#52728260
You can tell that this happened because I got this message during conversion from gfa -> pg:
rrounthw@mustard:/private/groups/patenlab/rrounthw/nygc/chr19/sim-hg002-reads/chr19-hg002-graph$ vg convert -p chr19-hg002.added-pound-sign-to-ref.gfa > chr19-hg002.added-pound-sign-to-ref.pg
warning:[GFAParser] Skipping GFA W line: GFA format error: On pass 1: On line 424042: Duplicate path hg002#2#JAHKSD010000042.1#0 exists in graph
warning:[GFAParser] Skipping GFA W line: GFA format error: On pass 1: On line 424043: Duplicate path hg002#2#JAHKSD010000042.1#0 exists in graph
And the output of vg paths -Lx shows only one path for each root:
hg002#2#JAHKSD010000042.1#0
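To see which paths are at risk before converting, the path names can be grouped by root. This is only a rough sketch, assuming the start offset is always a numeric fourth `#`-separated field as in the names above; `count_path_roots` is a hypothetical helper, not something vg provides:

```shell
# Hypothetical helper (not part of vg): read path names on stdin and print
# each root that has more than one fragment, followed by the fragment count.
# Assumes the start offset is a numeric fourth '#'-separated field.
count_path_roots() {
    awk -F'#' '
        NF == 4 && $4 ~ /^[0-9]+$/ {
            root = $1 "#" $2 "#" $3
            count[root]++
        }
        END {
            for (root in count)
                if (count[root] > 1)
                    print root, count[root]
        }
    ' | sort
}
```

It could be fed from the original graph with something like `vg paths -Lx chr19-hg002.xg | count_path_roots`.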
5. What data and command can the vg dev team use to make the problem happen? The original graph I used is at this location on mustard:
/private/groups/patenlab/rrounthw/nygc/chr19/sim-hg002-reads/chr19-hg002-graph/chr19-hg002.xg
The following commands should produce this issue:
vg convert -f chr19-hg002.xg > chr19-hg002.gfa
vg convert -p chr19-hg002.gfa > chr19-hg002.pg
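One way to confirm which paths went missing is to diff the path listings from before and after the round trip. A minimal sketch, assuming each listing was saved to a file with one path name per line; `lost_paths` and the file names are hypothetical:

```shell
# Hypothetical helper (not part of vg): print path names present in the first
# listing but absent from the second. Each argument is a file of path names,
# one per line, e.g. saved output of `vg paths -Lx`.
lost_paths() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    sort "$1" > "$before_sorted"
    sort "$2" > "$after_sorted"
    # comm -23 keeps only lines unique to the first (sorted) file.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}
```

For example, after saving `vg paths -Lx chr19-hg002.xg > before.txt` and `vg paths -Lx chr19-hg002.pg > after.txt`, running `lost_paths before.txt after.txt` should list the dropped fragments.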
6. What does running vg version say?
vg version v1.53.0-270-ga84af9ff7 "Valmontone"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by rrounthw@mustard