vg autoindex crashed trying to index a graph with oversized snarls

Open faithokamoto opened this issue 2 months ago • 15 comments

1. What were you trying to do?

Use vg autoindex --workflow lr-giraffe on a GBZ that I'd previously made a distance index for. I built the GBZ starting from a GFA:

GRAPH=/private/groups/patenlab/fokamoto/centrolign/graph/unsampled/chr12

# Convert GFA to GBZ
vg convert --gfa-in $GRAPH.gfa | vg mod --chop 1024 - > $GRAPH.pg
vg gbwt --index-paths -x $GRAPH.pg -o $GRAPH.gbwt
vg gbwt --gbz-format -x $GRAPH.pg $GRAPH.gbwt -g $GRAPH.giraffe.gbz

# Index GBZ for haplotype sampling
vg gbwt -r $GRAPH.ri -Z $GRAPH.giraffe.gbz
vg index -t 64 -w 1 -w 2 -j $GRAPH.dist $GRAPH.giraffe.gbz
vg haplotypes -v 3 -t 16 -H $GRAPH.hapl -d $GRAPH.dist -r $GRAPH.ri $GRAPH.giraffe.gbz

# Index GBZ for read alignment
vg autoindex --gbz $GRAPH.giraffe.gbz -w lr-giraffe --prefix $GRAPH

The input GFA was the result of Python surgery; I took two GFAs and combined them by adding a dummy source/sink (nodes 1 and 2) that connected to the start/end of every path. Node IDs were properly handled etc. I'm 90% sure that the surgery worked correctly, since it did manage to become a GBZ and all, but if you want to check, the surgery code is in /private/groups/patenlab/fokamoto/centrolign/code/add_dummy_caps.py.

Notably, in the distance index creation step, I got a warning about the index having oversized snarls. I suspect that's because the graph basically consists of that one giant centromere and it has a bunch of chains due to the aforementioned surgery method. Is this supposed to work with oversized snarls? Am I going to have to increase --snarl-limit like it suggested?

2. What did you want to happen?

Zipcode/minimizer files to be created.

3. What actually happened?

A bunch of fun errors, with this interesting line about is_regular_snarl() tossed in:

[vg autoindex] Guessing that /private/groups/patenlab/fokamoto/centrolign/graph/unsampled/chr12.dist is Giraffe Distance Index
[IndexRegistry]: Constructing minimizer index and associated zipcodes.
        use parameters -k 31 -w 50 -W payload type Standard
terminate called recursively
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively
terminate called recursively
━━━━━━━━━━━━━━terminate called recursively
terminate called recursively
terminate called recursively
std::runtime_error'
terminate called recursively
  what():  error: is_regular_snarl requires a graph if the distance index doesn't contain distances
━terminate called recursively
terminate called recursively

The log goes on but it's nothing useful. It ends with

v1.69.0 "Bologna"
Caught signal 0xb raised at address 0x56420bf78a38; tracing with backward-cpp
━━━Stack trace (most recent call last)━━
━#0x14   Object ", in #━━━
 raised at address 0x56420bf78a38; tracing with backward-cpp
━━━0x16f92f━━━━━━━━━━━━━━
Crash report for vg v1.69.0 "Bologna"
Caught signal 0xb raised at address ━━━━━━━━━━━0x25   Object "━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.69.0 "Bologna"
Caught signal 0xb━0x56420bf78a38 in thread ; tracing with backward-cpp
Stack trace (most recent call last)━Stack trace (most recent call last)━━━━━━━━", at 0, in
#0    Object "", at 0, in  raised at address 0x56420bf78a38; tracing with backward-cpp
━━━━━━━━━Crash report for vg :
#0x14   Object "", at ━Caught signal  in thread Crash report for vg ━━Crash report for vg Stack trace (most recent call last) in thread 0x16f92c:
#0x14   Object "", at
━━
#━━━━ in thread ━v1.69.0 "Bologna"━━━━━
Crash report for vg ━━", at 0xffffffffffffffffSegmentation fault (core dumped)

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

Nope. I guess I could rerun the autoindex and redirect the log output, if needed, but as I said there doesn't seem to be anything useful in there.
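For reference, a minimal sketch of what such a rerun could look like, with stdout and stderr both captured in one file. The helper name run_logged and the file name autoindex.log are made up for illustration; only the vg autoindex command itself comes from the steps above.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Run CMD with stdout and stderr both written to LOGFILE.
# The return status is CMD's own exit status.
run_logged() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Hypothetical usage with the command from this report:
# run_logged autoindex.log vg autoindex --gbz "$GRAPH.giraffe.gbz" -w lr-giraffe --prefix "$GRAPH"
```

Because the crash output is interleaved across threads, a single combined file is probably easier to search than separate stdout and stderr files.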

5. What data and command can the vg dev team use to make the problem happen?

See above command - all files on Phoenix cluster.

6. What does running vg version say?

vg version v1.69.0 "Bologna"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Using HTSlib headers 101990, library 1.19.1-29-g3cfe8769
Built by fokamoto@mustard

I'm using the installation in /private/home/fokamoto/normal_vg

faithokamoto avatar Oct 24 '25 00:10 faithokamoto

I tried passing the graph to is_regular_snarl() but the whole thing still errored. Ended up instead just remaking the distance index to avoid oversized snarls. Would be nice if autoindex could make this work, though. Or at least warn you.

faithokamoto avatar Oct 27 '25 20:10 faithokamoto

Huh, I got the exact same error when trying to autoindex a distance index that I'm sure doesn't have oversized snarls. Weird. [edit: that was wrong, it did have oversized snarls]

faithokamoto avatar Oct 28 '25 03:10 faithokamoto

I've also encountered this issue. The only fix I found was to reduce the size/diversity of the graph to reduce snarl size.

cwatt avatar Oct 30 '25 19:10 cwatt

I'm currently trying to regenerate my indexes to avoid oversized snarls by using vg index --snarl-limit during distance indexing - did that fix the problem for you?
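A rough sketch of that rebuild, assuming vg index accepts --snarl-limit alongside -j as in the distance index command from the original report. The function name, the VG override variable, and the LIMIT placeholder are illustrative only; options such as -t or -w from the original command would need to be added back.

```shell
# rebuild_dist_index GBZ DIST LIMIT
# Rebuild the distance index DIST for GBZ with a custom snarl size limit.
# VG may name a specific vg binary (an assumed convention here); defaults to "vg".
rebuild_dist_index() {
    gbz=$1
    dist=$2
    limit=$3
    "${VG:-vg}" index -j "$dist" --snarl-limit "$limit" "$gbz"
}

# Hypothetical usage; LIMIT is a placeholder, not a recommended value:
# rebuild_dist_index "$GRAPH.giraffe.gbz" "$GRAPH.dist" "$LIMIT"
```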

faithokamoto avatar Oct 30 '25 20:10 faithokamoto

I did try that but it didn't end up working for me, hopefully it does for you!

cwatt avatar Oct 30 '25 20:10 cwatt

When did you start having this issue?

faithokamoto avatar Oct 30 '25 22:10 faithokamoto

Well, vg autoindex is no longer crashing. It's also, uh, not finishing either. Just hanging and chowing down on the login node's resources. Gonna give it a few more days to see if it will ever complete.

faithokamoto avatar Nov 04 '25 19:11 faithokamoto

Hi all. I can report that, as of 1.69.0, I also see this problem on nearly every instance of indexing a personalized pangenome based on the v2 HPRC graph.

I believe the problem occurs because ZipCode::fill_in_zipcode_from_pos needs to be modified to pass around a graph pointer so that this call to SnarlDistanceIndex::is_regular_snarl doesn't fail whenever a graph contains oversized snarls that aren't stored explicitly in the distance index (because of this condition).

I think it's probably a separate concern that the personalized pangenomes are still ending up with oversized snarls. IIRC, it can be impractically slow to map with the resulting distance indexes. Is that still the case?

jeizenga avatar Nov 18 '25 18:11 jeizenga

Update: According to vg stats the maximum snarl size in some of the failing personalized graphs is 26, which I assume would be small enough for the distance index to represent explicitly, right? Maybe there's a bug that's causing normal-sized snarls to be labeled as oversize?

That said, it's still probably good to prevent ZipCode::fill_in_zipcode_from_pos from crashing on distance indexes that contain oversized snarls.

jeizenga avatar Nov 18 '25 21:11 jeizenga

The default size limit is 50k, so yeah, that shouldn't be leading to an oversized snarl. You could try using v1.70.0, which has an update to tell you the size of the largest snarl it saw if said snarl triggered oversized snarl behavior.

From what I remember, I tried to pass the GBZ through that call, but vg autoindex still failed and I ended up going around the problem in another way. I can look at it again if you need me to?

faithokamoto avatar Nov 18 '25 21:11 faithokamoto

I tried using 1.70.0 and got the same crash. The autoindex command didn't report any oversized snarls when it was constructing the distance index, which might bolster my theory that some snarls are being improperly marked as oversized when in fact they are normal size.

Do you remember what the workaround you came up with was?

jeizenga avatar Nov 19 '25 23:11 jeizenga

The workaround was to not index the graph, unfortunately. I used haplotype sampling to get a smaller graph and then aligned to that.

faithokamoto avatar Nov 19 '25 23:11 faithokamoto

If you can pass me one of the haplotype-sampled graphs that is having issues (either link it on Phoenix or email me -- it's in my Slack) then I can poke it.

faithokamoto avatar Nov 19 '25 23:11 faithokamoto

It turned out that I was using the wrong GBZ as input to autoindex on account of a pipelining error, so sorry for the false alarm about mysteriously oversized snarls. I'm no longer blocked on this, but FWIW I still think it's not great that autoindex just crashes with an uninformative error whenever it's passed a GBZ with oversized snarls.
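Until autoindex reports this itself, a wrapper script could at least flag the situation. A minimal sketch, assuming the saved log contains the is_regular_snarl message quoted in the original report; the function name and the hint wording are invented here and are not part of vg.

```shell
# explain_autoindex_log LOGFILE
# Print a hint and return 0 if LOGFILE contains the is_regular_snarl error
# quoted in this issue; return 1 otherwise.
explain_autoindex_log() {
    if grep -q "is_regular_snarl requires a graph" "$1"; then
        echo "hint: the distance index may contain oversized snarls"
        return 0
    fi
    return 1
}
```

This only pattern-matches the log after the fact; it does not inspect the distance index or prevent the crash.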

jeizenga avatar Nov 20 '25 20:11 jeizenga

Did vg autoindex use to work with oversized snarls? Anyhow, I can still look into fixing this error, but I'm going to have to rustle up a different test case I guess.

faithokamoto avatar Nov 20 '25 20:11 faithokamoto