VCF header missing = silent failure

Open mdshw5 opened this issue 10 years ago • 6 comments

If the VCF header is missing from the input data stream, vawk seems to silently fail when constructing an awk program.

mdshw5 avatar Jan 12 '15 21:01 mdshw5

Can you show me the command you're running? I get an error with the following:

$ curl -s ftp://platgene:[email protected]/IlluminaPlatinumGenomes_v7.0/merged_platinum/NA12878.vcf.gz | gzip -cdfq | grep -v "^#" | vawk '{ print S$NA12878 }'
awk: illegal field $(), name "NA12878"
 input record number 1
 source line number 1

cc2qe avatar Jan 12 '15 21:01 cc2qe

curl -s ftp://platgene:[email protected]/IlluminaPlatinumGenomes_v7.0/merged_platinum/NA12878.vcf.gz | gzip -cdfq | grep -v "^#" | vawk '{ print S$NA12878 }' | head
chr1    116549  .       C       T       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    120458  .       T       C       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    125271  .       C       T       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    126113  .       C       A       .       PASS    MTD=isaac2,bwa_freebayes,bwa_platypus,bwa_gatk3        GT      1|1
chr1    128798  .       C       T       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    129963  .       T       A       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    139967  .       T       C       .       SuspiciousHomAlt        MTD=bwa_platypus       GT      1|1
chr1    172595  .       G       A       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    173173  .       A       G       .       SuspiciousHomAlt        MTD=bwa_freebayes      GT      1|1
chr1    229673  .       A       C       .       SuspiciousHomAlt        MTD=bwa_freebayes,bwa_platypus,bwa_gatk3       GT      1|1

mdshw5 avatar Jan 12 '15 22:01 mdshw5

That's with current master HEAD.

mdshw5 avatar Jan 12 '15 22:01 mdshw5

Another example that replicates my issue:

curl -s ftp://platgene:[email protected]/IlluminaPlatinumGenomes_v7.0/merged_platinum/NA12878.vcf.gz | gzip -cdfq | grep -v "^#" | vawk '{ if (S$NA12878$GT==S$NA12879$GT) print }' | head -n2
chr1    116549  .   C   T   .   SuspiciousHomAlt    MTD=bwa_freebayes   GT  1|1
chr1    120458  .   T   C   .   SuspiciousHomAlt    MTD=bwa_freebayes   GT  1|1

Shouldn't this fail?

mdshw5 avatar Jan 13 '15 01:01 mdshw5

And FYI, example 1 works under Linux but fails on my Mac. Example 2 works on both Linux and Mac.

mdshw5 avatar Jan 13 '15 01:01 mdshw5

Thanks for reporting these. It seems to result from awk versions behaving differently when they attempt to retrieve a key that is not in the dictionary. I'll put it on my todo list.

cc2qe avatar Jan 13 '15 01:01 cc2qe
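
One way to turn the silent failure into an explicit error is to check for the #CHROM header line before any sample lookup is attempted. The following is only a minimal sketch of that idea, not vawk's actual code: the function name sample_columns and the error message are hypothetical, and it assumes a standard tab-delimited VCF in which the nine fixed columns (CHROM through FORMAT) precede the sample columns.

```python
def sample_columns(lines):
    """Map each sample name to its 1-based column number.

    Hypothetical helper for illustration; raises ValueError when the
    '#CHROM' header line is missing instead of returning an empty mapping.
    """
    for line in lines:
        if line.startswith("##"):
            continue  # skip meta-information lines
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # Columns 1-9 are fixed (CHROM ... FORMAT); samples start at 10.
            return {name: i for i, name in enumerate(fields[9:], start=10)}
        break  # reached a data line without seeing the header
    raise ValueError("VCF header line (#CHROM ...) not found in input")
```

With a check like this, a stream that has been passed through grep -v "^#" would be rejected up front, so the result would no longer depend on how a particular awk handles a missing key.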