`duplicate rank` error in `vg find` when extracting a graph subregion

Open AndreaGuarracino opened this issue 4 years ago • 3 comments

1. What were you trying to do? I was trying to extract a subregion from a graph encoded in a GFA format file.

2. What did you want to happen? I want to obtain a GFA format file with the graph subregion plus the paths' portions covering that subregion.

3. What actually happened? I get this error: `error[load_proto_to_graph]: duplicate rank 5944 in path hg38_chr2`

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here: -

5. What data and command can the vg dev team use to make the problem happen?

input=hppy1+chm13+h38-chr2.fa.gz.pggb-s4000-l12000-p98-n6-a0-K16-k29-w180000-j10000-e10000-I0.5-R0.2.smooth.renamed.gfa
path_ref=hg38_chr2

vg convert --gfa-in $input -x > $input.xg
vg find -p ${path_ref}:91597913-96188605 -x $input.xg > $input.${path_ref}_91597913_96188605.vg
vg view $input.${path_ref}_91597913_96188605.vg > $input.${path_ref}_91597913_96188605.gfa

Here is the input used.

6. What does running vg version say? vg: variation graph tool, version v1.30.0 "Carentino"

AndreaGuarracino, Feb 10 '21 09:02

I'd suggest 2 things:

  • Specify a context size with `-c` to pull in nodes adjacent to your path. In `vg find` it defaults to 0, which probably won't give you what you want.
  • Use `vg chunk` instead of `vg find` (the relevant options remain the same). `vg chunk` is more careful not to create path fragments that cause trouble downstream.
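A minimal sketch of the second suggestion, reusing the path and coordinates from the report. The helper names `region_spec` and `extract_region`, the output prefix, and the context value of 10 are assumptions for illustration, not values from this thread; pick a context size that suits your graph.

```shell
# Hypothetical helper: build a "path:start-end" target string for -p.
region_spec() {
  printf '%s:%s-%s' "$1" "$2" "$3"
}

# Hypothetical helper: extract a path range with vg chunk, then convert to GFA.
#   $1 = xg index, $2 = target region, $3 = context steps, $4 = output prefix
extract_region() {
  vg chunk -x "$1" -p "$2" -c "$3" > "$4.vg" &&
    vg view "$4.vg" > "$4.gfa"
}

# Usage (assumes "$input.xg" was built with vg convert as in the report;
# the context of 10 steps is an arbitrary example value):
#   extract_region "$input.xg" "$(region_spec hg38_chr2 91597913 96188605)" 10 subregion
```

The only change from the reported commands is `vg chunk` in place of `vg find`, plus an explicit `-c`.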

glennhickey, Feb 10 '21 14:02

Thank you @glennhickey! Using `-c` with `vg find` didn't work, but with `vg chunk` plus `-c` I was able to get the graph's chunk.

Does this mean that `vg find` is going to be deprecated?

AndreaGuarracino, Feb 12 '21 13:02

Is there any update on this issue?

I'm having the same problem, except that I'm using `-N` instead of `-p` for the selection. Unfortunately I cannot use `vg chunk`: I'm trying to extract a list of nodes (with context), and `vg chunk` extracts each node into a separate chunk. I want a single graph, without duplication, that contains the paths.

heringerp, Nov 11 '25 09:11