vg autoindex crashed trying to index a graph with oversized snarls
1. What were you trying to do?
Use vg autoindex --workflow lr-giraffe on a GBZ that I'd previously made a distance index for. My creation started from a GFA:
GRAPH=/private/groups/patenlab/fokamoto/centrolign/graph/unsampled/chr12
# Convert GFA to GBZ
vg convert --gfa-in $GRAPH.gfa | vg mod --chop 1024 - > $GRAPH.pg
vg gbwt --index-paths -x $GRAPH.pg -o $GRAPH.gbwt
vg gbwt --gbz-format -x $GRAPH.pg $GRAPH.gbwt -g $GRAPH.giraffe.gbz
# Index GBZ for haplotype sampling
vg gbwt -r $GRAPH.ri -Z $GRAPH.giraffe.gbz
vg index -t 64 -w 1 -w 2 -j $GRAPH.dist $GRAPH.giraffe.gbz
vg haplotypes -v 3 -t 16 -H $GRAPH.hapl -d $GRAPH.dist -r $GRAPH.ri $GRAPH.giraffe.gbz
# Index GBZ for read alignment
vg autoindex --gbz $GRAPH.giraffe.gbz -w lr-giraffe --prefix $GRAPH
The input GFA was the result of Python surgery; I took two GFAs and combined them by adding a dummy source/sink (nodes 1 and 2) that connected to the start/end of every path. Node IDs were properly handled etc. I'm 90% sure that the surgery worked correctly, since it did manage to become a GBZ and all, but if you want to check, the surgery code is in /private/groups/patenlab/fokamoto/centrolign/code/add_dummy_caps.py.
Notably, in the distance index creation step, I got a warning about the index having oversized snarls. I suspect that's because the graph basically consists of that one giant centromere and it has a bunch of chains due to the aforementioned surgery method. Is this supposed to work with oversized snarls? Am I going to have to increase --snarl-limit like it suggested?
2. What did you want to happen?
Zipcode/minimizer files to be created.
3. What actually happened?
A bunch of fun errors, with this interesting line about is_regular_snarl() tossed in:
[vg autoindex] Guessing that /private/groups/patenlab/fokamoto/centrolign/graph/unsampled/chr12.dist is Giraffe Distance Index
[IndexRegistry]: Constructing minimizer index and associated zipcodes.
use parameters -k 31 -w 50 -W payload type Standard
terminate called recursively
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively
terminate called recursively
━━━━━━━━━━━━━━terminate called recursively
terminate called recursively
terminate called recursively
std::runtime_error'
terminate called recursively
what(): error: is_regular_snarl requires a graph if the distance index doesn't contain distances
━terminate called recursively
terminate called recursively
The log goes on but it's nothing useful. It ends with
v1.69.0 "Bologna"
Caught signal 0xb raised at address 0x56420bf78a38; tracing with backward-cpp
━━━Stack trace (most recent call last)━━
━#0x14 Object ", in #━━━
raised at address 0x56420bf78a38; tracing with backward-cpp
━━━0x16f92f━━━━━━━━━━━━━━
Crash report for vg v1.69.0 "Bologna"
Caught signal 0xb raised at address ━━━━━━━━━━━0x25 Object "━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.69.0 "Bologna"
Caught signal 0xb━0x56420bf78a38 in thread ; tracing with backward-cpp
Stack trace (most recent call last)━Stack trace (most recent call last)━━━━━━━━", at 0, in
#0 Object "", at 0, in raised at address 0x56420bf78a38; tracing with backward-cpp
━━━━━━━━━Crash report for vg :
#0x14 Object "", at ━Caught signal in thread Crash report for vg ━━Crash report for vg Stack trace (most recent call last) in thread 0x16f92c:
#0x14 Object "", at
━━
#━━━━ in thread ━v1.69.0 "Bologna"━━━━━
Crash report for vg ━━", at 0xffffffffffffffffSegmentation fault (core dumped)
4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:
Nope. I guess I could rerun the autoindex and redirect the log output, if needed, but as I said there doesn't seem to be anything useful in there.
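If the full log does turn out to be needed, a rerun that captures it could look roughly like the sketch below. It reuses the autoindex command from section 1. The log file name, the fallback `chr12` prefix, and the `run` helper are assumptions for illustration: `run` is a hypothetical dry-run wrapper that only prints the command unless `DRY_RUN=0` is set.

```shell
set -eu

# Hypothetical fallback prefix; point GRAPH at the real graph prefix.
GRAPH="${GRAPH:-chr12}"
# Assumed log file name, not something vg chooses for you.
LOG="${LOG:-$GRAPH.autoindex.log}"

# Dry-run wrapper: prints the command by default, runs it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Send both stdout and stderr to the log so the interleaved crash report is kept.
run vg autoindex --gbz "$GRAPH.giraffe.gbz" -w lr-giraffe --prefix "$GRAPH" \
    > "$LOG" 2>&1 || echo "vg autoindex failed; see $LOG"
```

Run it with `DRY_RUN=0` to actually execute the command; the `|| echo` keeps the script from aborting silently when vg exits with a crash status.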
5. What data and command can the vg dev team use to make the problem happen?
See above command - all files on Phoenix cluster.
6. What does running vg version say?
vg version v1.69.0 "Bologna"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Using HTSlib headers 101990, library 1.19.1-29-g3cfe8769
Built by fokamoto@mustard
I'm using the installation in /private/home/fokamoto/normal_vg
I tried passing the graph to is_regular_snarl() but the whole thing still errored. Ended up instead just remaking the distance index to avoid oversized snarls. Would be nice if autoindex could make this work, though. Or at least warn you.
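For reference, remaking the distance index along those lines might look like the sketch below: the `vg index ... -j` call from the original post with `--snarl-limit` added. The limit value (100000), the thread count, and the fallback `chr12` prefix are placeholders, not values anyone in this thread reported using, and `run` is a hypothetical dry-run wrapper that only prints the command unless `DRY_RUN=0` is set.

```shell
set -eu

# Hypothetical fallback prefix; point GRAPH at the real graph prefix.
GRAPH="${GRAPH:-chr12}"
# Placeholder limit; pick one at or above the largest snarl size that vg reports.
SNARL_LIMIT="${SNARL_LIMIT:-100000}"

# Dry-run wrapper: prints the command by default, runs it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Rebuild the distance index with a raised snarl limit, so that large snarls
# are stored with distances instead of being treated as oversized.
run vg index -t 16 --snarl-limit "$SNARL_LIMIT" -j "$GRAPH.dist" "$GRAPH.giraffe.gbz"
```

A larger limit presumably means a bigger, slower distance index, so this trades indexing cost for avoiding the oversized-snarl path in autoindex.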
Huh, I got the exact same error when trying to autoindex a distance index that I'm sure doesn't have oversized snarls. Weird. [edit: that was wrong, it did have oversized snarls]
I've also encountered this issue. The only fix I found was to reduce the size/diversity of the graph to reduce snarl size.
I'm currently trying to regenerate my indexes to avoid oversized snarls by using vg index --snarl-limit during distance indexing - did that fix the problem for you?
I did try that but it didn't end up working for me, hopefully it does for you!
Since when did you start having this issue?
Well, vg autoindex is no longer crashing. It's also, uh, not finishing either. Just hanging and chowing down on the login node's resources. Gonna give it a few more days to see if it will ever complete.
Hi all. I can report that, as of 1.69.0, I also see this problem on nearly every instance of indexing a personalized pangenome based on the v2 HPRC graph.
I believe the problem occurs because ZipCode::fill_in_zipcode_from_pos needs to be modified to pass around a graph pointer so that this call to SnarlDistanceIndex::is_regular_snarl doesn't fail whenever a graph contains oversized snarls that aren't stored explicitly in the distance index (because of this condition).
I think it's probably a separate concern that the personalized pangenomes are still ending up with oversized snarls. IIRC, it can be impractically slow to map with the resulting distance indexes. Is that still the case?
Update: According to vg stats, the maximum snarl size in some of the failing personalized graphs is 26, which I assume would be small enough for the distance index to represent explicitly, right? Maybe there's a bug that's causing normal-sized snarls to be labeled as oversized?
That said, it's still probably good to prevent ZipCode::fill_in_zipcode_from_pos from crashing on distance indexes that contain oversized snarls.
The default size limit is 50k, so yeah, that shouldn't be leading to an oversized snarl. You could try using v1.70.0, which has an update to tell you the size of the largest snarl it saw if said snarl triggered oversized snarl behavior.
From what I remember, I tried to pass the GBZ through that call, but vg autoindex still failed and I ended up going around the problem in another way. I can look at it again if you need me to?
I tried using 1.70.0 and got the same crash. The autoindex command didn't report any oversized snarls when it was constructing the distance index, which might bolster my theory that some snarls are being improperly marked as oversized when in fact they are normal size.
Do you remember what the workaround you came up with was?
The workaround was to not index the graph, unfortunately. I used haplotype sampling to get a smaller graph and then aligned to that.
If you can pass me one of the haplotype-sampled graphs that is having issues (either link it on Phoenix or email me -- it's in my Slack) then I can poke it.
It turned out that I was using the wrong GBZ as input to autoindex on account of a pipelining error, so sorry for the false alarm about mysteriously oversized snarls. I'm no longer blocked on this, but FWIW I still think it's not great that autoindex just crashes with an uninformative error whenever it's passed a GBZ with oversized snarls.
Did vg autoindex used to work with oversized snarls? Anyhow, I can still look into fixing this error, but I'm going to have to rustle up a different test case I guess.