
HLA-Genotyping: <error> Converting "HLA-A*01:01:01:01" of size 17 to int failed

Open Lukecassar21 opened this issue 1 month ago • 2 comments

Good morning. I would like some guidance on an issue I've run into while genotyping structural variants with graphtyper. I am attempting a test run of graphtyper v2 on multiple samples with HLA contigs. The VCF file has 30 samples called by manta and merged using JasmineSV. Like svimmer, JasmineSV preserves the SV information from the original VCF files, and I have previously used graphtyper's genotype_sv command successfully to genotype structural variants from manta and smoove merged by JasmineSV.

The problem is that my true samples have HLA contigs containing structural variants that must be genotyped as well. When genotyping with the appropriate reference genome, regardless of whether I use a region file or a specific HLA contig as the region, I get the following error:

<error> Converting "HLA-A*01:01:01:01" of size 17 to int failed.

Because of this, I changed the HLA contig names in the regions file, VCF file, and reference FASTA (working on separate copies of each so as not to overwrite the original files) to use underscores instead of colons, e.g. "HLA-A*01_01_01_01". For a short while this seemed to work: the command ran without issue until it attempted to genotype the HLA regions.
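For reference, the renaming described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the exact procedure used: the helper names (`sanitize_contig`, `sanitize_fasta_header`, `sanitize_vcf_line`) are hypothetical, and the sketch assumes only the FASTA sequence name, the VCF `##contig` header IDs, and the VCF CHROM column need rewriting.

```python
CONTIG_PREFIX = "##contig=<ID="


def sanitize_contig(name: str) -> str:
    """Replace colons in a contig name with underscores."""
    return name.replace(":", "_")


def sanitize_fasta_header(line: str) -> str:
    """Rewrite only the sequence name (first token) of a FASTA header line."""
    if not line.startswith(">"):
        return line  # sequence lines are left untouched
    body = line[1:]
    for i, ch in enumerate(body):
        if ch.isspace():
            # Keep the description (and newline) after the name unchanged.
            return ">" + sanitize_contig(body[:i]) + body[i:]
    return ">" + sanitize_contig(body)


def sanitize_vcf_line(line: str) -> str:
    """Rewrite the contig ID in ##contig headers and the CHROM column.

    Assumption: other fields that may mention contigs (for example
    breakend ALT alleles) are NOT handled by this sketch.
    """
    if line.startswith(CONTIG_PREFIX):
        rest = line[len(CONTIG_PREFIX):]
        # The ID ends at the first comma or closing angle bracket.
        ends = [i for i in (rest.find(","), rest.find(">")) if i != -1]
        end = min(ends, default=len(rest))
        return CONTIG_PREFIX + sanitize_contig(rest[:end]) + rest[end:]
    if line.startswith("#"):
        return line  # all other header lines pass through unchanged
    chrom, sep, rest = line.partition("\t")
    return sanitize_contig(chrom) + sep + rest
```

Note that a sketch like this only touches text files (FASTA, VCF, region list). The alignment files listed via --sams are not modified by it and keep whatever contig names they were aligned against.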

Command Used:

 ./graphtyper genotype_sv \
     /homes/lcass09/sv_calling/output/graphtyper_output/Homo_sapiens_assembly38_HLA.fasta \
     /homes/user/sv_calling/output/graphtyper_output/jasmine_30_samples_HLA_strands_sorted.vcf.gz \
     --sams=/homes/user/sv_calling/output/SURVIVOR_merge_output/30_bam_samples.txt \
     --region_file=/homes/user/sv_calling/output/graphtyper_output/sorted_combined_chromosome_contig_list.txt \
     --output=/homes/user/sv_calling/output/graphtyper_output/30_Samples_Mantaonly

Output observed after multiple hours:


[W::tbx_parse1] VCF INFO/END=22916 is smaller than POS at chr1:137018                                   
This tag will be ignored. Note: only one invalid END tag will be reported.                              
[2024-05-15 20:39:25.780] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 44044804                                                                                            
[2024-05-15 20:39:36.525] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 44044804                                                                                            
[2024-05-16 09:56:59.814] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 94051932                                                                                            
[2024-05-16 09:57:12.866] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 94051932                                                                                            
[2024-05-16 11:33:48.384] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 80930428                                                                                            
[2024-05-16 16:26:31.266] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 32030496                                                                                            
[2024-05-16 16:26:42.076] <warning> [constructor.cpp:719] I do not know how to add an insertion at position 32030496                                                                                            
[2024-05-16 20:17:50.450] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
[2024-05-16 20:17:50.467] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
[2024-05-16 20:17:50.467] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
[2024-05-16 20:17:50.467] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
[2024-05-16 20:17:50.467] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
[2024-05-16 20:17:50.467] <error> hts_reader.cpp:113 Failed to query region 'HLA-A*01_01_01_01:1-203503'
*** Error in './graphtyper': corrupted double-linked list: 0x0000000001ee1460 ***                       
Segmentation fault (core dumped)           

I'm not sure why this is happening or how to work around it. Keeping the HLA contigs is important for my purposes, so I would greatly appreciate any guidance or insight.

Thank you for your time and patience.

Lukecassar21, May 20 '24 06:05