htslib
HTSlib should fail on trailing INFO garbage
The version I use is 1.11
The command I ran is bcftools view -R <target_region>.tsv -Oz -o <output_path>.vcf.gz <input_path>.vcf.gz
The VCF file is the golden VCF from simulated data. The input VCF file looks like this:
The output vcf file looks like this:
Note the part marked by the red circle: the end of the row (the trailing slash and the last digit) is cut off in the output. Please take a look at this issue and let me know how I can resolve it. Thx!
This is partly a problem with your VCF, partly with HTSlib:
- The header says the WP field is an Integer with Number=A values. If so, the values in the body should be comma-separated, not slash-separated. There is also the wrong number of values.
- However, the library should fail, or at least print a warning, about the broken INFO record.
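To illustrate the first point, here is a minimal Python sketch of the kind of check involved, assuming the usual VCF convention that a Number=A field carries one comma-separated value per ALT allele. The function name and the sample values are hypothetical (the actual WP values from the report are not reproduced here), and this is not how HTSlib itself validates records.

```python
import re
from typing import List

def check_number_a_integer(raw_value: str, n_alt: int) -> List[str]:
    """Return problems found in an INFO value declared Number=A, Type=Integer.

    Hypothetical helper for illustration only; not part of HTSlib or bcftools.
    """
    problems = []
    items = raw_value.split(",")  # VCF separates multiple values with commas
    if len(items) != n_alt:
        problems.append(f"expected {n_alt} value(s), found {len(items)}")
    for item in items:
        if item == ".":  # "." is the VCF missing-value marker
            continue
        if not re.fullmatch(r"[+-]?[0-9]+", item):
            problems.append(f"{item!r} is not an integer")
    return problems

# Made-up values: a slash-separated value fails the integer check
print(check_number_a_integer("3/5", n_alt=1))
# A comma-separated value with one entry per ALT allele passes
print(check_number_a_integer("3,5", n_alt=2))
```

An empty list means the value is consistent with the header declaration; anything else is the sort of record the library should arguably reject or warn about.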
Thx for the response! In this case, how should I modify the format of my VCF file to make this right?
btw, bcftools view gives no warning messages
I don't know what the intention is, but it would probably be best to redefine the tag in the header as Type=String. That way it will stay preserved.
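As a rough sketch of that header change, the Python snippet below rewrites only the Type of a given INFO definition line and leaves everything else alone. The helper name and the Description text in the example are placeholders, not taken from the real file; whether Number also needs adjusting depends on data not shown here.

```python
import re

def retype_info_as_string(header_line: str, tag: str) -> str:
    """Change Type=... to Type=String on the ##INFO line that defines `tag`.

    Hypothetical helper for illustration; other header lines pass through unchanged.
    """
    if not header_line.startswith(f"##INFO=<ID={tag},"):
        return header_line
    # Replace only the first Type= entry, which precedes Description in a normal header
    return re.sub(r"Type=[A-Za-z]+", "Type=String", header_line, count=1)

# Placeholder header line; the real Description is not known
line = '##INFO=<ID=WP,Number=A,Type=Integer,Description="placeholder">'
print(retype_info_as_string(line, "WP"))
```

The edited header text could then be applied to the compressed file with a tool such as bcftools reheader.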