Crash after running vg rna
Hi
I'm running vg version 1.34.
I'm attempting to create a splice graph.
I have a GTF file in which the first column gives the names of the paths in the graph, across all genomes. I generated the GTF with StringTie from RNA-seq alignments in a sorted BAM file. I built a genome graph with PGGB, and I'm trying to embed the transcript regions into the seqwish graph before continuing with the normalisation step in PGGB using smoothxg.
I've been getting the following error:
+ srun -n 1 singularity exec --bind /data/pangenome_20way/results_rerun/test_vg_rna/1H:/data/pangenome_20way/results_rerun/test_vg_rna/1H /data/vg_builds/vg.sif vg rna -t 32 -n /data/pangenome_20way/results_rerun/test_vg_rna/1H/reference.gtf -p -e barley_pangenome_1H_s1000000_l0_p95_k316_B10000000_I0_R0_j100_e0_P1-4-6-2-26-1/barley_pangenome_1H.fasta.5afc036.7715ffd.seqwish.gfa.pg
[vg rna] Parsing graph file ...
[vg rna] Graph parsed in 1.51227 seconds, 2.14698 GB
[vg rna] Adding novel exon boundaries and splice-junctions to graph ...
[vg rna] 0 introns and 210668 transcripts parsed, and graph augmented in 188.199 seconds, 13.1727 GB
[vg rna] Topological sorting and compacting splice graph ...
[vg rna] Splice graph sorted and compacted in 29.7066 seconds, 13.1727 GB
[vg rna] Projecting haplotype-specfic transcripts ...
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1101: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `haplotype_path_start_step != haplotype_path_end_step' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1081: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.second == 0' failed.
vg: src/transcriptome.cpp:1081: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.second == 0' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
vg: src/transcriptome.cpp:1080: std::__cxx11::list<vg::EditedTranscriptPath> vg::Transcriptome::project_transcript_embedded(const vg::Transcript&, const bdsg::PositionOverlay&, bool) const: Assertion `border_offsets.first + 1 == _splice_graph->get_length(_splice_graph->get_handle_of_step(haplotype_path_start_step))' failed.
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_SltY5f/stacktrace.txt
Please include the stack trace file in your bug report!
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_JWOJQg/stacktrace.txt
Please include the stack trace file in your bug report!
srun: error: node-12: task 0: Segmentation fault (core dumped)
The head of my GTF is:
HOR10350_v1_chr1H StringTie transcript 580313 581138 1000 + . gene_id "Horvu_HOR_10350_MSTRG.1"; transcript_id "Horvu_10350_1H01G002000.1"; ref_gene_id "Horvu_10350_1H01G002000";
HOR10350_v1_chr1H StringTie exon 580313 580490 1000 + . gene_id "Horvu_HOR_10350_MSTRG.1"; transcript_id "Horvu_10350_1H01G002000.1"; exon_number "1"; ref_gene_id "Horvu_10350_1H01G0020>
HOR10350_v1_chr1H StringTie exon 580906 581138 1000 + . gene_id "Horvu_HOR_10350_MSTRG.1"; transcript_id "Horvu_10350_1H01G002000.1"; exon_number "2"; ref_gene_id "Horvu_10350_1H01G0020>
HOR10350_v1_chr1H StringTie transcript 254863 259344 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1";
HOR10350_v1_chr1H StringTie exon 254863 255339 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "1";
HOR10350_v1_chr1H StringTie exon 255448 255588 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "2";
HOR10350_v1_chr1H StringTie exon 255685 255735 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "3";
HOR10350_v1_chr1H StringTie exon 256222 256393 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "4";
HOR10350_v1_chr1H StringTie exon 256477 256723 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "5";
HOR10350_v1_chr1H StringTie exon 256811 257085 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "6";
HOR10350_v1_chr1H StringTie exon 257248 257652 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "7";
HOR10350_v1_chr1H StringTie exon 257755 258076 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "8";
HOR10350_v1_chr1H StringTie exon 258161 258317 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "9";
HOR10350_v1_chr1H StringTie exon 258447 258506 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "10";
HOR10350_v1_chr1H StringTie exon 259131 259344 1000 - . gene_id "Horvu_HOR_10350_MSTRG.2"; transcript_id "Horvu_HOR_10350_MSTRG.2.1"; exon_number "11";
HOR10350_v1_chr1H StringTie transcript 340170 342653 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; ref_gene_id "Horvu_10350_1H01G000700";
HOR10350_v1_chr1H StringTie exon 340170 340207 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "1"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie exon 341462 341765 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "2"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie exon 341774 341908 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "3"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie exon 341919 342151 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "4"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie exon 342432 342545 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "5"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie exon 342568 342653 1000 - . gene_id "Horvu_HOR_10350_MSTRG.3"; transcript_id "Horvu_10350_1H01G000700.1"; exon_number "6"; ref_gene_id "Horvu_10350_1H01G0007>
HOR10350_v1_chr1H StringTie transcript 352722 355685 1000 + . gene_id "Horvu_HOR_10350_MSTRG.4"; transcript_id "Horvu_10350_1H01G000800.1"; ref_gene_id "Horvu_10350_1H01G000800";
HOR10350_v1_chr1H StringTie exon 352722 355685 1000 + . gene_id "Horvu_HOR_10350_MSTRG.4"; transcript_id "Horvu_10350_1H01G000800.1"; exon_number "1"; ref_gene_id "Horvu_10350_1H01G0008>
HOR10350_v1_chr1H StringTie transcript 374900 378302 1000 - . gene_id "Horvu_HOR_10350_MSTRG.5"; transcript_id "Horvu_10350_1H01G000900.1"; ref_gene_id "Horvu_10350_1H01G000900";
HOR10350_v1_chr1H StringTie exon 374900 375186 1000 - . gene_id "Horvu_HOR_10350_MSTRG.5"; transcript_id "Horvu_10350_1H01G000900.1"; exon_number "1"; ref_gene_id "Horvu_10350_1H01G0009>
HOR10350_v1_chr1H StringTie exon 375278 376393 1000 - . gene_id "Horvu_HOR_10350_MSTRG.5"; transcript_id "Horvu_10350_1H01G000900.1"; exon_number "2"; ref_gene_id "Horvu_10350_1H01G0009>
The errors seem to point to a problem with position offsets. Could this be related to whether the exon coordinates are 0-based or 1-based?
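For reference, the GTF format uses 1-based coordinates with both ends inclusive. Below is a minimal Python sketch of what that means in practice, converting a GTF interval to the 0-based, half-open form many tools use internally. The function names are made up for illustration and are not part of vg or StringTie; the numbers come from the first exon in the GTF head above.

```python
def gtf_to_half_open(start, end):
    """Convert a 1-based inclusive GTF interval to 0-based half-open.

    Hypothetical helper for illustration only.
    """
    if start < 1 or end < start:
        raise ValueError(f"not a valid 1-based inclusive interval: {start}-{end}")
    return start - 1, end


def gtf_feature_length(start, end):
    """Length in bases of a 1-based inclusive interval."""
    return end - start + 1


# First exon of Horvu_10350_1H01G002000.1 from the GTF head above.
zero_based = gtf_to_half_open(580313, 580490)
print(zero_based)                          # (580312, 580490)
print(gtf_feature_length(580313, 580490))  # 178
```

If the coordinates were accidentally 0-based, every feature would be shifted one base to the left, and a feature starting at position 0 would be rejected by the check above.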
Thank you for any help you can provide.
The crash seems to happen when transcripts are projected between paths in the graph, but I am not sure why these assertions fail. I do not think it is a 0-based versus 1-based problem, since an earlier assertion would have failed if that were the case.
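As a way to rule out the simplest input problems before digging into the projection code, the following Python sketch checks that every GTF feature names a path that exists in the graph and lies within that path's length. It is only a pre-check and does not reproduce what `vg rna` does internally. It assumes the text GFA written by seqwish is available (rather than the `.pg` file), that `S` lines carry the sequence itself rather than `*`, and that the GTF is tab-separated. The function names and the toy data are hypothetical.

```python
def gfa_path_lengths(gfa_lines):
    """Return {path_name: length in bases} from GFA v1 S and P lines."""
    seg_len = {}
    path_steps = {}
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # Assumes the sequence is stored inline (not "*").
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            # Steps look like "1+,2-"; drop the orientation character.
            path_steps[fields[1]] = [s[:-1] for s in fields[2].split(",") if s]
    # Sum after reading everything, so P lines may precede S lines.
    return {name: sum(seg_len[s] for s in steps)
            for name, steps in path_steps.items()}


def suspicious_features(gtf_lines, path_len):
    """List GTF features that name a missing path or fall outside it."""
    problems = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        seqname, feature, start, end = f[0], f[2], int(f[3]), int(f[4])
        if seqname not in path_len:
            problems.append((seqname, feature, start, end, "no such path"))
        elif start < 1 or end < start or end > path_len[seqname]:
            problems.append((seqname, feature, start, end, "outside path"))
    return problems


# Toy data for illustration only (not from the barley graph).
toy_gfa = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tGG",
    "P\tchrA\t1+,2+\t*",
]
toy_gtf = [
    "chrA\ttest\texon\t1\t6\t.\t+\t.\tgene_id \"g1\";",
    "chrA\ttest\texon\t5\t7\t.\t+\t.\tgene_id \"g1\";",
    "chrB\ttest\texon\t1\t2\t.\t+\t.\tgene_id \"g2\";",
]

lengths = gfa_path_lengths(toy_gfa)
for problem in suspicious_features(toy_gtf, lengths):
    print(problem)
```

On real data the lists would be replaced by open file handles for the seqwish GFA and the GTF. An empty result only rules out these two input problems; it does not explain the assertions.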
Would it be possible for you to share the data?
Hi @jonassibbesen
Sure, I could upload the seqwish graph and GTF file to you. How can I get the files to you? Thanks.
Are you able to share them using Dropbox, Google Drive or something similar? My email is [email protected]
Hi @jonassibbesen
Thanks for your help. I've put the data on Google Drive and have emailed you.
Perfect, thank you!