bcftools sort problem

Open jaurbanChicago opened this issue 2 years ago • 2 comments

Hello,

I am using bcftools 1.16 and I am trying to use bcftools sort to sort the VCF chromosomes numerically instead of lexicographically. I am using the following command:

bcftools sort dbWGS.112022.reheader.vcf.gz -Oz -o dbWGS.112022.reheader.sort.vcf.gz

and I am getting the following errors:

Writing to /tmp/bcftools.1qAkOf
[W::vcf_parse_info] INFO 'GT1168' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'BaseQRankSum' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'ClippingRankSum' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'ExcessHet' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'FS' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'InbreedingCoeff' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'MQ' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'MQRankSum' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'NEGATIVE_TRAIN_SITE' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'POSITIVE_TRAIN_SITE' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'QD' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'ReadPosRankSum' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'SOR' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'VQSLOD' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'culprit' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'VQSRMODE' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'NS' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AA_chimp' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AA_ensembl' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AA' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'VT' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'DP' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'MLEAC' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'MLEAF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'ExcHet' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'EAS_AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AMR_AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AFR_AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'EUR_AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'SAS_AF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'RAF' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'INFO' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AN' is not defined in the header, assuming Type=String
[W::vcf_parse_info] INFO 'AC' is not defined in the header, assuming Type=String
Error encountered while parsing the input at 1:51479
Cleaning

Is there any way to fix this issue? Thanks a lot in advance!

jaurbanChicago, Feb 23 '23 03:02

The only working solution for now is to fix the header to include all the missing tags (bcftools reheader) or drop the undefined tags (bcftools annotate -x).

This is because bcftools sort internally converts the file to BCF, and BCF requires every tag to be defined in the header.
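
For the header-fixing route, here is a minimal sketch (not part of bcftools) that turns the warnings from a log like the one above into placeholder `##INFO` header lines. The function name `missing_info_header_lines`, the `Number=.` and `Type=String` values, and the description text are all assumptions made for illustration. String matches what htslib assumed in the warnings, but the real definitions (for example Float for allele frequencies) are preferable if you know them.

```python
import re

# Matches htslib warnings of the form shown in the log above, e.g.
# [W::vcf_parse_info] INFO 'FS' is not defined in the header, assuming Type=String
WARNING_RE = re.compile(r"INFO '([^']+)' is not defined in the header")


def missing_info_header_lines(log_text):
    """Return one placeholder ##INFO line per undefined tag, in first-seen order."""
    seen = []
    for tag in WARNING_RE.findall(log_text):
        if tag not in seen:  # each tag should be defined only once
            seen.append(tag)
    # Number=. and Type=String are placeholder assumptions, not the true definitions.
    return [
        f'##INFO=<ID={tag},Number=.,Type=String,'
        f'Description="Placeholder definition for {tag}">'
        for tag in seen
    ]


if __name__ == "__main__":
    sample_log = "\n".join([
        "[W::vcf_parse_info] INFO 'FS' is not defined in the header, assuming Type=String",
        "[W::vcf_parse_info] INFO 'MQ' is not defined in the header, assuming Type=String",
    ])
    for line in missing_info_header_lines(sample_log):
        print(line)
```

The generated lines would go into a copy of the existing header (as printed by `bcftools view -h`), before the `#CHROM` line, and the edited header can then be applied with `bcftools reheader -h`.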

pd3, Feb 23 '23 08:02

A possible enhancement would be to sort VCFs natively, without the need for BCF conversion. This was planned but never high enough on the priority list. Marking it as an enhancement for future reference.
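
To illustrate what sorting the text directly could look like, here is a rough sketch that orders VCF records by chromosome (numerically where the name is a number) and then by position, without parsing INFO at all. This is not how bcftools sort works and not the planned implementation. The helper names `chrom_key` and `sort_vcf_lines` are hypothetical, everything is held in memory, and non-numeric names such as X or MT are simply placed after the numeric ones in lexicographic order.

```python
def chrom_key(chrom):
    """Sort key placing numeric chromosome names first, in numeric order."""
    # Treat "chr7" and "7" alike; this prefix handling is an assumption.
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    # Non-numeric names (X, Y, MT, ...) follow, ordered lexicographically.
    return (1, 0, name)


def sort_vcf_lines(lines):
    """Return header lines unchanged, followed by records sorted by (CHROM, POS)."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if line and not line.startswith("#")]

    def record_key(line):
        # Only the first two tab-separated columns are needed: CHROM and POS.
        chrom, pos = line.split("\t", 2)[:2]
        return (chrom_key(chrom), int(pos))

    return header + sorted(records, key=record_key)


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "10\t5\t.\tA\tG\t.\t.\t.",
        "2\t100\t.\tA\tG\t.\t.\t.",
        "X\t1\t.\tA\tG\t.\t.\t.",
        "2\t7\t.\tA\tG\t.\t.\t.",
    ]
    print("\n".join(sort_vcf_lines(example)))
```

Because the records are never converted to BCF, undefined INFO tags do not matter here, but a large file would need an external or chunked sort rather than a single in-memory list.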

pd3, Feb 28 '23 13:02