gtc2vcf
Error Encountered while parsing the input
Hello -
Thank you for this tool for converting IDAT files into usable vcf files for analysis in plink. It's terrific!
I recently ran into an issue that I cannot seem to resolve on my own. When I attempt to create the bcf/vcf files from my gtc files, I receive the following error:
faidx_fetch_seqa failed at chrY:58862852 (are you using the correct reference genome?)
Error encountered while parsing the input
I am using the same csv, bpm, and egt files that I used previously without error and my fasta reference file is GRCh38. My bcftools version is 1.16, which I believe should be a sufficient version for use with the gtc2vcf plugin. Before I remove and reinstall bcftools, I wanted to check with you to see whether there was a simpler fix for this problem.
Thank you. Chris
Chromosome Y in GRCh38 is 57,227,415 base pairs long, and your manifest file has a coordinate at 58,862,852, which means that the coordinates in your manifest file are not for GRCh38. I don't know what array type and what version of the manifest file you are using, so I cannot guess why this is happening. However, the correct course of action here is to either find out whether Illumina has GRCh38 manifest files for your array or simply use gtc2vcf's framework to realign the manifest file and then use the realigned coordinates when performing the conversion to VCF with gtc2vcf's option --sam-flank. You should always use this approach unless you have first verified that your manifest file exclusively contains GRCh38 coordinates.
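For anyone who wants to spot this kind of mismatch before running the conversion, here is a minimal Python sketch that compares manifest coordinates against the contig lengths in the reference's .fai index. It is not part of gtc2vcf. It assumes the CSV manifest has an [Assay] section with Name, Chr, and MapInfo columns, and that the reference may name contigs with a "chr" prefix; the function names are made up for illustration. Passing this check does not prove the manifest is GRCh38, since it only catches positions that fall past the end of a contig.

```python
import csv


def read_fai_lengths(fai_lines):
    """Map contig name -> length using the first two columns of a .fai index."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def assay_rows(manifest_lines):
    """Yield dict rows from the [Assay] section of a CSV manifest (assumed layout)."""
    lines = iter(manifest_lines)
    # Skip everything up to and including the "[Assay]" marker line.
    for line in lines:
        if line.strip().startswith("[Assay]"):
            break

    def body():
        # Stop at the next section marker, e.g. "[Controls]".
        for line in lines:
            if line.startswith("["):
                return
            yield line

    yield from csv.DictReader(body())


def out_of_range(manifest_lines, lengths):
    """Return (name, contig, position, contig_length) for probes past the contig end."""
    bad = []
    for row in assay_rows(manifest_lines):
        chrom = row.get("Chr") or ""
        pos = row.get("MapInfo") or ""
        # Skip unmapped probes (missing, non-numeric, or zero position).
        if not pos.isdigit() or int(pos) == 0:
            continue
        # Assumption: the reference uses either "Y" or "chrY" style names.
        contig = chrom if chrom in lengths else "chr" + chrom
        if contig in lengths and int(pos) > lengths[contig]:
            bad.append((row.get("Name") or "", contig, int(pos), lengths[contig]))
    return bad
```

To use it on real files, open the manifest and the .fai index (the file written next to the FASTA by samtools faidx) and pass the file objects in, for example out_of_range(open("manifest.csv"), read_fai_lengths(open("reference.fa.fai"))), where both file names are placeholders. Any rows returned indicate coordinates that cannot belong to that reference.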
Dear Giulio,
Thank you. This is very helpful information. Much appreciated.