Runtime error: Assertion `k == n_a' failed
I recently tested minigraph (0.5-r294-dirty) on the 12 YPRP yeast genomes and was impressed by the speed and small output file size. I'm now trying to create graphs using a set of five public maize genomes, starting with analysis of separate chromosomes. The maize chromosomes range in length from 150-330 Mbp and are estimated to be up to 85% transposable elements.
I'm using the same parameters (-x ggs -t 10) that I used for the yeast genomes, but my first three tests, with chromosomes 1, 2, and 3, each failed with the error message:
minigraph: lchain.c:226: mg_update_anchors: Assertion `k == n_a' failed.
It appears to me that the error occurs while mapping the first or second non-reference sequence:
chr01 log
[M::main::1.046*0.99] loaded the graph from "Zm_B73_chr01.fna"
[M::mg_index::14.496*1.46] indexed the graph
[M::mg_opt_update::15.174*1.44] occ_weight=20, occ_max1=625; 95 percentile: 5
[M::ggen_map::16.640*1.40] loaded file "Zm_DK105_chr01.fna"
minigraph: lchain.c:226: mg_update_anchors: Assertion `k == n_a' failed.
chr02 log
[M::main::0.848*0.98] loaded the graph from "Zm_B73_chr02.fna"
[M::mg_index::12.921*1.42] indexed the graph
[M::mg_opt_update::13.435*1.40] occ_weight=20, occ_max1=553; 95 percentile: 5
[M::ggen_map::14.641*1.37] loaded file "Zm_DK105_chr02.fna"
minigraph: lchain.c:226: mg_update_anchors: Assertion `k == n_a' failed.
chr03 log
[M::main::0.808*0.98] loaded the graph from "Zm_B73_chr03.fna"
[M::mg_index::11.379*1.56] indexed the graph
[M::mg_opt_update::11.884*1.53] occ_weight=20, occ_max1=535; 95 percentile: 5
[M::ggen_map::13.075*1.48] loaded file "Zm_DK105_chr03.fna"
[M::ggen_map::80176.139*1.00] mapped 1 sequence(s) to the graph
[M::mg_ggsimple::80176.781*1.00] inserted 1332 events
[M::mg_index::80186.833*1.00] indexed the graph
[M::mg_opt_update::80187.546*1.00] occ_weight=20, occ_max1=544; 95 percentile: 5
[M::ggen_map::80191.878*1.00] loaded file "Zm_EP1_chr03.fna"
minigraph: lchain.c:226: mg_update_anchors: Assertion `k == n_a' failed.
Beyond this, it's not obvious to me from the error message what the exact problem is, so I'm not sure how best to modify the parameters. Any insights or suggestions would be appreciated!
Could you point me to Zm_B73_chr01.fna and Zm_DK105_chr01.fna for download? Thanks!
I've temporarily uploaded them to my personal Dropbox. Let me know if you need anything else.
Note: The files are annoyingly large, so I probably won't leave them in my Dropbox for long. In case anyone visits this post later, the sequences themselves are publicly available on NCBI. I simply separated them by chromosome and renamed them.
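For anyone rebuilding the inputs from the NCBI downloads, here is a minimal sketch of the split-and-rename step. It is not the exact script used for this post: the function name split_fasta, the prefix, and the names mapping (record ID to chromosome label) are all hypothetical, and it assumes every header has a non-empty ID as its first token.

```python
import os

def split_fasta(path, out_dir, prefix="", names=None):
    """Write each FASTA record to its own file, <prefix><label>.fna.

    `names` is an optional, hypothetical mapping from the record ID in the
    header (first token after '>') to a friendlier label such as "chr01".
    Records missing from the mapping keep their original ID as the label.
    """
    names = names or {}
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                # A new record starts: close the previous output file, if any.
                if out is not None:
                    out.close()
                record_id = line[1:].split()[0]
                label = names.get(record_id, record_id)
                out_path = os.path.join(out_dir, f"{prefix}{label}.fna")
                out = open(out_path, "w")
                written.append(out_path)
            # Header and sequence lines are copied unchanged; anything
            # before the first header is skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

Calling split_fasta("assembly.fna", "by_chr", prefix="Zm_B73_", names={...}) with a mapping from the assembly's record IDs to labels like "chr01" would produce per-chromosome files named in the style used above; the input file name and the mapping contents are placeholders.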
Thanks. I have downloaded the two files. You can delete them from your personal Dropbox. Maize genomes are more challenging than human. I will look into this issue, but it will take some time.
+1
It would be really nice if you could take a look at this. I am curious about using minigraph with maize genomes, too.
Hello again, were you able to reproduce the error message?
Sorry for the late response. I can reproduce this and can confirm it is a bug. However, minigraph also seems to be too slow for the maize genome, possibly because of its high repeat content. I need to investigate the performance first, which will take time. I will leave the issue open. Thank you very much for reporting this.
Sorry to trouble you with it. Thanks for taking a look!