Revisit "Jumbo" Annotations and find a more descriptive name

Open · jamesemery opened this issue 2 years ago · 1 comment

Recently, a new class of annotations, JumboInfoFieldAnnotations and JumboGenotypeAnnotations, was introduced into GATK for Mutect. Their names are somewhat misleading and confusing on a first pass. I would suggest renaming them to FragmentAnnotations, or to something else that more accurately describes what they use.

Furthermore, given the state of the annotation engine and the type-system nightmare that lurks beneath the surface, it is quite difficult to use these annotations in any context except when the likelihoods have been computed in terms of fragments. Getting there can be a non-trivial conversion that shouldn't happen in every case. We should revisit the types for this whole class and find some way to make these annotations more usable outside of Mutect.

jamesemery, Nov 03 '21 18:11

To add to this ticket: in #7876 we have had to expand the JumboAnnotations to work in the HaplotypeCaller as well. Unfortunately, this has created problems, since there are no evidence objects in the HC, so we have had to change the erasure of the annotate() methods somewhat. Some hacky code is now part of the VariantAnnotatorEngine: the addInfoAnnotations() method has to resolve the complicated spiderweb of which likelihoods objects do or don't exist at any given time and then cast them to what they likely are. This really needs to be revisited and refactored to handle the extra annotation inputs more gracefully.
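As a rough illustration of one possible direction (a sketch only, not the actual GATK code), the engine could receive a single typed bundle of whichever likelihoods were computed, so that each kind of annotation is handed only the object it understands and no casting is needed. Every name below (AnnotationInputs, Likelihoods, ReadAnnotation, FragmentAnnotation, annotateAll) is hypothetical and exists only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these types are real GATK classes.
final class AnnotationInputsSketch {

    // Marker types for the two kinds of evidence discussed in this issue.
    static final class Read {}
    static final class Fragment {}

    // Stand-in for a likelihoods object parameterized by its evidence type.
    static final class Likelihoods<E> {
        final int evidenceCount;
        Likelihoods(int evidenceCount) { this.evidenceCount = evidenceCount; }
    }

    // Bundles whichever likelihoods the caller actually computed.
    static final class AnnotationInputs {
        private final Likelihoods<Read> reads;         // null if not computed
        private final Likelihoods<Fragment> fragments; // null if not computed

        AnnotationInputs(Likelihoods<Read> reads, Likelihoods<Fragment> fragments) {
            this.reads = reads;
            this.fragments = fragments;
        }

        Optional<Likelihoods<Read>> readLikelihoods() { return Optional.ofNullable(reads); }
        Optional<Likelihoods<Fragment>> fragmentLikelihoods() { return Optional.ofNullable(fragments); }
    }

    // Each annotation kind declares the one input type it can consume.
    interface ReadAnnotation { void annotate(Likelihoods<Read> l, Map<String, Object> out); }
    interface FragmentAnnotation { void annotate(Likelihoods<Fragment> l, Map<String, Object> out); }

    // Dispatches on which inputs are present; annotations whose input is absent are skipped.
    static Map<String, Object> annotateAll(AnnotationInputs inputs,
                                           List<ReadAnnotation> readAnnotations,
                                           List<FragmentAnnotation> fragmentAnnotations) {
        Map<String, Object> out = new LinkedHashMap<>();
        inputs.readLikelihoods()
              .ifPresent(l -> readAnnotations.forEach(a -> a.annotate(l, out)));
        inputs.fragmentLikelihoods()
              .ifPresent(l -> fragmentAnnotations.forEach(a -> a.annotate(l, out)));
        return out;
    }

    public static void main(String[] args) {
        List<ReadAnnotation> readAnnotations = new ArrayList<>();
        readAnnotations.add((l, out) -> out.put("READ_COUNT", l.evidenceCount));
        List<FragmentAnnotation> fragmentAnnotations = new ArrayList<>();
        fragmentAnnotations.add((l, out) -> out.put("FRAGMENT_COUNT", l.evidenceCount));

        // Hypothetical caller that computed only read-based likelihoods.
        AnnotationInputs readsOnly = new AnnotationInputs(new Likelihoods<>(10), null);
        System.out.println(annotateAll(readsOnly, readAnnotations, fragmentAnnotations));
    }
}
```

With this shape, a caller that never builds fragment likelihoods simply passes none, and the fragment-based annotations are skipped rather than fed a cast object. Whether that skipping behavior is what we want is a separate question.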

jamesemery, Jul 21 '22 18:07