
VariantAnnotator giving error "No overlapping contigs found"

Open • khanshahan opened this issue 11 months ago • 1 comment

Bug Report

Affected tool(s) or class(es)

VariantAnnotator

Affected version(s)

GATK-4.6.1.0

Description

I am having a problem running VariantAnnotator. I assumed the command can run with just a VCF file as input. Please correct me if I am wrong; if so, how should it be used? Is there a way to annotate using only a VCF? See Steps to reproduce for the exact command I used.

Steps to reproduce

Using Ubuntu Linux.

1. Get the dbSNP file from https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.25.gz
2. Create the index file using the command: gatk IndexFeatureFile --input GCF_000001405.25.gz
3. Get the GRCh37-based VCF from https://www.dropbox.com/scl/fi/mzek8urjqeb3sohe8p2j7/consensus.vcf?rlkey=s0ejvujwa4xswjq6oq9l5n3mu&st=zb03s12o&dl=0

Command: gatk VariantAnnotator -V consensus.vcf -dbsnp GCF_000001405.25.gz --output output.vcf

Expected behavior

The output.vcf file should contain entries along with rsIDs from RefSNP.

Actual behavior

I get the following error (only part of the log is shown here).
`22:30:01.480 INFO VariantAnnotator - Initializing engine
22:30:01.523 INFO FeatureManager - Using codec VCFCodec to read file file:///home/ngadmin/development/variant_caller_test/GCF_000001405.25.gz
22:30:01.578 INFO FeatureManager - Using codec VCFCodec to read file file:///home/ngadmin/development/variant_caller_test/consensus.vcf
22:30:01.582 WARN IndexUtils - Feature file "file:///home/ngadmin/development/variant_caller_test/GCF_000001405.25.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
22:30:01.652 INFO VariantAnnotator - Shutting down engine
[January 3, 2025 at 10:30:01 PM CET] org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotator done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=629145600


A USER ERROR has occurred: Input files best available and features have incompatible contigs: No overlapping contigs found. best available contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chr6_ssto_hap7, chr6_mcf_hap5, chr6_cox_hap2, chr6_mann_hap4, chr6_apd_hap1, chr6_qbl_hap6, chr6_dbb_hap3, chr17_ctg5_hap1, chr4_ctg9_hap1, chr1_gl000192_random, chrUn_gl000225, chr4_gl000194_random, chr4_gl000193_random, chr9_gl000200_random, chrUn_gl000222, chrUn_gl000212, chr7_gl000195_random, chrUn_gl000223, chrUn_gl000224, chrUn_gl000219, chr17_gl000205_random, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chr9_gl000199_random, chrUn_gl000211, chrUn_gl000213, chrUn_gl000220, chrUn_gl000218, chr19_gl000209_random, chrUn_gl000221, chrUn_gl000214, chrUn_gl000228, chrUn_gl000227, chr1_gl000191_random, chr19_gl000208_random, chr9_gl000198_random, chr17_gl000204_random, chrUn_gl000233, chrUn_gl000237, chrUn_gl000230, chrUn_gl000242, chrUn_gl000243, chrUn_gl000241, chrUn_gl000236, chrUn_gl000240, chr17_gl000206_random, chrUn_gl000232, chrUn_gl000234, chr11_gl000202_random, chrUn_gl000238, chrUn_gl000244, chrUn_gl000248, chr8_gl000196_random, chrUn_gl000249, chrUn_gl000246, chr17_gl000203_random, chr8_gl000197_random, chrUn_gl000245, chrUn_gl000247, chr9_gl000201_random, chrUn_gl000235, chrUn_gl000239, chr21_gl000210_random, chrUn_gl000231, chrUn_gl000229, chrM, chrUn_gl000226, chr18_gl000207_random] features contigs = [NC_000001.10, NC_000002.11, NC_000003.11, NC_000004.11, NC_000005.9, NC_000006.11, NC_000007.13, NC_000008.10, NC_000009.11, NC_000010.10, NC_000011.9, NC_000012.11, NC_000013.10, NC_000014.8, NC_000015.9, NC_000016.9, NC_000017.10, NC_000018.9, NC_000019.9, NC_000020.10, NC_000021.8, NC_000022.10, NC_000023.10, NC_000024.9, NC_012920.1, NT_113878.1, NT_113885.1, NT_113888.1, NT_113889.1, NT_113891.2, NT_113901.1, NT_113907.1, NT_113909.1, 
NT_113911.1, NT_113914.1, NT_113915.1, NT_113916.2, NT_113921.2, NT_113923.1, NT_113930.1, NT_113941.1, NT_113943.1, NT_113945.1, NT_113947.1, NT_113948.1, NT_113949.1, NT_113950.2, NT_113961.1, NT_167207.1, NT_167208.1, NT_167209.1, NT_167210.1, NT_167211.1, NT_167212.1, NT_167213.1, NT_167214.1, NT_167215.1, NT_167216.1, NT_167217.1, NT_167218.1, NT_167219.1, NT_167220.1, NT_167221.1, NT_167222.1, NT_167223.1, NT_167224.1, NT_167225.1, NT_167226.1, NT_167227.1, NT_167228.1, NT_167229.1, NT_167230.1, NT_167231.1, NT_167232.1, NT_167233.1, NT_167234.1, NT_167235.1, NT_167236.1, NT_167237.1, NT_167238.1, NT_167239.1, NT_167240.1, NT_167241.1, NT_167242.1, `
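
The two lists in the message use different naming schemes: one is chr-prefixed (chr1, chr2, ...), while the dbSNP file uses RefSeq accessions (NC_000001.10, ...). As a minimal sketch, and not a GATK feature, the Python below translates primary-chromosome accessions into chr-style names. The function names are hypothetical. Only NC_000001 through NC_000024 are handled; the mitochondrial accession (NC_012920.1) and the NT_ scaffolds are deliberately left unmapped because they would need an explicit lookup table.

```python
import re

# Assumption: accessions look like NC_0000NN.V, where NN is 01..22 for the
# autosomes, 23 for chromosome X and 24 for chromosome Y.
_PRIMARY = re.compile(r"^NC_0000(\d{2})\.\d+$")


def refseq_to_chr(accession):
    """Return a chr-style name for a primary chromosome accession, else None."""
    match = _PRIMARY.match(accession)
    if match is None:
        return None  # scaffolds, mitochondrion, or anything unexpected
    number = int(match.group(1))
    if 1 <= number <= 22:
        return f"chr{number}"
    return {23: "chrX", 24: "chrY"}.get(number)


def rename_lines(accessions):
    """Build 'old<TAB>new' lines for every accession that can be mapped."""
    lines = []
    for accession in accessions:
        name = refseq_to_chr(accession)
        if name is not None:
            lines.append(f"{accession}\t{name}")
    return lines
```

Lines in this old-name/new-name form are the kind of input accepted by renaming tools such as `bcftools annotate --rename-chrs`. A renamed file would need to be re-indexed before use, and any record on an unmapped contig would keep its original name.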

khanshahan, Jan 03 '25 21:01

Hi @khanshahan, you need to check whether your input and feature files are compatible, that is, whether they use the same contig names. If you are working with hg38, we recommend using the file at the link below:

https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/GATK/All_20180418.vcf.gz

For the hg19 version, check out this one: https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/GATK/All_20180423.vcf.gz
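
One way to run the compatibility check suggested above before calling GATK is to compare the contig names in the two files directly. The following is a rough sketch in Python using only the standard library; the function names are hypothetical and it is not how GATK performs its own check. It reads `##contig` header lines when present and otherwise falls back to the CHROM column, which means scanning the whole file and can be slow for something the size of dbSNP.

```python
import gzip


def open_text(path):
    """Open a plain-text or gzip/bgzip-compressed file for reading as text."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic bytes
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def contig_names(path):
    """Return contig names from ##contig headers, else from the CHROM column."""
    declared = {}  # dicts are used as ordered sets
    observed = {}
    with open_text(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            if line.startswith("##contig=<"):
                body = line.strip()[len("##contig=<"):].rstrip(">")
                for field in body.split(","):
                    if field.startswith("ID="):
                        declared[field[3:]] = None
            elif not line.startswith("#"):
                if declared:
                    break  # the header already listed the contigs
                observed[line.split("\t", 1)[0]] = None
    return list(declared or observed)


def shared_contigs(vcf_a, vcf_b):
    """Return the contig names of vcf_a that also occur in vcf_b."""
    names_b = set(contig_names(vcf_b))
    return [name for name in contig_names(vcf_a) if name in names_b]
```

Calling `shared_contigs("consensus.vcf", "GCF_000001405.25.gz")` would be expected to return an empty list for the files in this report, matching the "No overlapping contigs found" message. A non-empty result only shows that some names match; it does not prove both files are on the same genome build.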

Regards.

gokalpcelik, Jan 06 '25 15:01