Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB

Open nalinigans opened this issue 1 year ago • 8 comments

As a result of #8715, GenomicsDB now supports emitting GT as a no-call (instead of hom-ref) when GQ and PL[0] are both zero. See https://github.com/GenomicsDB/GenomicsDB/pull/332.
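For readers who want the rule spelled out, here is a minimal Python sketch of the condition as stated above. It is only an illustration, not the GenomicsDB implementation: the function name `genotype_for_sample` is hypothetical, and it assumes diploid, unphased sample columns where GT, GQ and PL are all listed in the FORMAT string.

```python
def genotype_for_sample(format_keys, sample_field):
    """Return the GT for one sample column after applying the no-call rule.

    Hypothetical helper for illustration only (not GenomicsDB code).
    Assumes a diploid, unphased genotype such as "0/0".
    """
    values = dict(zip(format_keys.split(":"), sample_field.split(":")))
    gt = values.get("GT", "./.")
    gq = values.get("GQ", ".")
    pl = values.get("PL", ".").split(",")
    # Rule from the PR description: a hom-ref call with GQ == 0 and
    # PL[0] == 0 is reported as a no-call instead.
    if gt == "0/0" and gq == "0" and pl[0] == "0":
        return "./."
    return gt


if __name__ == "__main__":
    keys = "GT:GQ:MIN_DP:PL:DP"
    print(genotype_for_sample(keys, "0/0:0:45:0,0,0:45"))
    print(genotype_for_sample(keys, "0/0:90:31:0,90,1350:32"))
```

The first call has GQ and PL[0] both equal to zero, so it yields a no-call; the second has GQ of 90 and keeps its hom-ref genotype.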

nalinigans avatar Apr 02 '24 16:04 nalinigans

@nalinigans Thanks for this! What changed in the combined_with_genotypes.g.vcf.gz file?

droazen avatar Apr 02 '24 17:04 droazen

@nalinigans Thanks for this! What changed in the combined_with_genotypes.g.vcf.gz (https://github.com/broadinstitute/gatk/pull/8759/files#diff-0c62eda00e61538d71e15979bb9a3cb28a7af64f36f94e040dea51cad5a000b0) file?

Some of the representative changes (second line has the no-call after genomicsdb changes):

orig: chr20 17959479 . T <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:0:45:0,0,0:45 0/0:0:47:0,0,0:47 0/0:0:19:0,0,0:19
new: chr20 17959479 . T <NON_REF> . . . GT:GQ:MIN_DP:PL:DP ./.:0:45:0,0,0:45 ./.:0:47:0,0,0:47 ./.:0:19:0,0,0:19

orig: chr20 17960111 . G A,<NON_REF> . . BaseQRankSum=1.641;ClippingRankSum=1.506;ExcessHet=3.0103;MQRankSum=-0.247;RAW_MQ=107001.0;ReadPosRankSum=-0.202;DP=95 GT:AD:GQ:MIN_DP:PL:SB:DP 0/0:.:66:22:0,66,719,66,719,719:.:23 0/0:.:99:43:0,106,1474,106,1474,1474:.:48 0/1:18,11,0:99:.:261,0,434,315,467,781:9,9,6,5:29
new: chr20 17960111 . G A,<NON_REF> . . BaseQRankSum=1.641;ClippingRankSum=1.506;ExcessHet=3.0103;MQRankSum=-0.247;RAW_MQ=107001;ReadPosRankSum=-0.202;DP=95 GT:AD:GQ:MIN_DP:PL:SB:DP 0/0:.:66:22:0,66,719,66,719,719:.:23 0/0:.:99:43:0,106,1474,106,1474,1474:.:48 0/1:18,11,0:99:.:261,0,434,315,467,781:9,9,6,5:29

orig: chr20 17960223 . T G,<NON_REF> . . BaseQRankSum=1.294;ClippingRankSum=-0.102;ExcessHet=3.0103;MQRankSum=0.855;RAW_MQ=363600.0;ReadPosRankSum=-0.241;DP=101 GT:AD:GQ:PL:SB:DP 0/1:13,16,0:99:429,0,325,468,373,841:9,4,9,7:29 0/1:23,22,0:99:555,0,605,624,670,1294:12,11,8,14:45 1/1:0,27,0:81:847,81,0,847,81,847:0,0,13,14:27
new: chr20 17960223 . T G,<NON_REF> . . BaseQRankSum=1.294;ClippingRankSum=-0.102;ExcessHet=3.0103;MQRankSum=0.855;RAW_MQ=363600;ReadPosRankSum=-0.241;DP=101 GT:AD:GQ:PL:SB:DP 0/1:13,16,0:99:429,0,325,468,373,841:9,4,9,7:29 0/1:23,22,0:99:555,0,605,624,670,1294:12,11,8,14:45 1/1:0,27,0:81:847,81,0,847,81,847:0,0,13,14:27

orig: chr20 17960334 . G A,<NON_REF> . . BaseQRankSum=0.114;ClippingRankSum=0.089;ExcessHet=3.0103;MQRankSum=-1.843;RAW_MQ=154800.0;ReadPosRankSum=0.597;DP=94 GT:AD:GQ:MIN_DP:PL:SB:DP 0/0:.:90:32:0,90,1350,90,1350,1350:.:32 0/1:24,18,0:99:.:420,0,573,491,627,1118:12,12,8,10:42 0/0:.:54:19:0,54,810,54,810,810:.:19
new: chr20 17960334 . G A,<NON_REF> . . BaseQRankSum=0.114;ClippingRankSum=0.089;ExcessHet=3.0103;MQRankSum=-1.843;RAW_MQ=154800;ReadPosRankSum=0.597;DP=94 GT:AD:GQ:MIN_DP:PL:SB:DP 0/0:.:90:32:0,90,1350,90,1350,1350:.:32 0/1:24,18,0:99:.:420,0,573,491,627,1118:12,12,8,10:42 0/0:.:54:19:0,54,810,54,810,810:.:19

orig: chr20 17960349 . A <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:90:31:0,90,1350:32 0/0:99:38:0,101,1210:42 0/0:0:21:0,0,522:21
new: chr20 17960349 . A <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:90:31:0,90,1350:32 0/0:99:38:0,101,1210:42 ./.:0:21:0,0,522:21

orig: chr20 17960360 . T <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:82:32:0,82,963:32 0/0:66:35:0,66,1008:35 0/0:0:18:0,0,416:18
new: chr20 17960360 . T <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:82:32:0,82,963:32 0/0:66:35:0,66,1008:35 ./.:0:18:0,0,416:18
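Since long records like these are hard to compare by eye, here is a rough Python sketch for diffing an orig/new pair column by column. The helper names `parse_info` and `diff_records` are made up for this example, and it assumes whitespace-separated records with INFO in the eighth column and samples from the tenth column on. Applied to the pairs above, it reports only two kinds of change: GT switching from 0/0 to ./. in the samples where GQ and PL[0] are zero, and RAW_MQ losing its trailing ".0" (for example 107001.0 becoming 107001) at the three variant sites.

```python
def parse_info(info):
    """Split an INFO string into a dict; flag entries (no '=') map to None."""
    result = {}
    if info == ".":
        return result
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        result[key] = value if "=" in entry else None
    return result


def diff_records(orig_line, new_line):
    """Return (label, old, new) tuples for every field that differs.

    Hypothetical helper for illustration; assumes both records have the
    same number of whitespace-separated columns.
    """
    diffs = []
    pairs = zip(orig_line.split(), new_line.split())
    for idx, (old, new) in enumerate(pairs):
        if old == new:
            continue
        if idx == 7:  # INFO column: compare key by key
            old_info, new_info = parse_info(old), parse_info(new)
            for key in sorted(set(old_info) | set(new_info)):
                if old_info.get(key) != new_info.get(key):
                    diffs.append((f"INFO/{key}", old_info.get(key), new_info.get(key)))
        elif idx >= 9:  # per-sample columns, numbered from 1
            diffs.append((f"sample{idx - 8}", old, new))
        else:
            diffs.append((f"column{idx}", old, new))
    return diffs


if __name__ == "__main__":
    orig = "chr20 17960349 . A <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:90:31:0,90,1350:32 0/0:99:38:0,101,1210:42 0/0:0:21:0,0,522:21"
    new = "chr20 17960349 . A <NON_REF> . . . GT:GQ:MIN_DP:PL:DP 0/0:90:31:0,90,1350:32 0/0:99:38:0,101,1210:42 ./.:0:21:0,0,522:21"
    for label, old, updated in diff_records(orig, new):
        print(label, old, "->", updated)
```

For the chr20 17960349 pair this reports a single difference, in the third sample column.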

nalinigans avatar Apr 03 '24 03:04 nalinigans

@droazen, note that I have published only a 1.5.3 snapshot version of genomicsdb. If these changes are good, I will make a full release of 1.5.3 and reference that in build.gradle.

nalinigans avatar Apr 03 '24 04:04 nalinigans

@droazen, build.gradle now references the 1.5.3 version. Feel free to merge these changes. Thanks.

nalinigans avatar Apr 10 '24 15:04 nalinigans

Thanks @nalinigans ! Since https://github.com/broadinstitute/gatk/pull/8741 was just merged with some additional changes from @ldgauthier, could you please rebase onto the latest master branch and let tests run? I'm concerned there might be discrepancies in the test outputs between what's in this branch and what was merged in that PR that we might have to reconcile.

droazen avatar Apr 10 '24 17:04 droazen

@droazen, I had merged from master last week, wondering if this can be pulled in? Thanks.

nalinigans avatar Apr 22 '24 16:04 nalinigans

@nalinigans Yes, we just need to take a peek inside the VCF file you updated. Were there any additional changes to that VCF when you updated to the latest master?

droazen avatar Apr 22 '24 18:04 droazen

@droazen I did not have to change the VCF when I merged with master.

nalinigans avatar Apr 23 '24 12:04 nalinigans