Feature Request - extract Funcotator VCF INFO 'sub-fields'
A user on the GATK Forum requested a way to make the Funcotator INFO field easier to manipulate by writing it out as a table. At the GATK Office Hours meeting on 11/8, we discussed the user's two ideas and favored the first: a new tool, similar to VariantsToTable, that would unpack the INFO field.
This request was created from a contribution made by Shahryar Alavi on October 30, 2020 19:54 UTC.
--
But MAF output is somewhat different from VCF, and I think the VCF output format is better for germline variant annotation.
With Funcotator we get an integrated (and more detailed) "variant calling - annotation" workflow. The problem is that the vertical-bar-separated values in the INFO field are not easy to handle in downstream text processing.
I have two suggestions for the GATK Team:
1. You may want to develop a new tool (like VariantsToTable) that separates each sub-field of the FUNCOTATION INFO field and puts them into separate columns, with corresponding headers, when creating the tab-delimited table.
2. Or add a feature to Funcotator that creates multiple INFO fields with a FUNCOTATION prefix in their IDs, e.g.
##INFO=<ID=FUNCOTATION_Gencode_34_hugoSymbol,...>
##INFO=<ID=FUNCOTATION_Gencode_34_ncbiBuild,...>
instead of
##INFO=<ID=FUNCOTATION,...,Description="Funcotation fields are: Gencode_34_hugoSymbol|Gencode_34_ncbiBuild|...">
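To illustrate the first suggestion, here is a minimal Python sketch of the unpacking step. It is not an existing GATK tool. It assumes that the sub-field names can be read from the "Funcotation fields are:" part of the header Description, and that each annotation set in the FUNCOTATION value is wrapped in square brackets with vertical-bar-separated values. The function names and the sample values (BRCA1, TP53, hg38) are made up for the example, and any escaping that Funcotator applies to special characters is ignored.

```python
import re

# Text that precedes the sub-field names in the FUNCOTATION header Description.
HEADER_MARKER = "Funcotation fields are: "


def funcotation_field_names(header_line):
    """Extract the sub-field names from the FUNCOTATION INFO header line."""
    start = header_line.index(HEADER_MARKER) + len(HEADER_MARKER)
    end = header_line.index('"', start)  # closing quote of Description
    return header_line[start:end].strip().split("|")


def split_funcotations(info_value, field_names):
    """Return one dict per bracketed annotation set in a FUNCOTATION value."""
    records = []
    for group in re.findall(r"\[([^\]]*)\]", info_value):
        values = group.split("|")
        if len(values) != len(field_names):
            raise ValueError(
                f"expected {len(field_names)} values, got {len(values)}"
            )
        records.append(dict(zip(field_names, values)))
    return records


def to_tsv(records, field_names):
    """Render the records as a tab-delimited table with a header row."""
    lines = ["\t".join(field_names)]
    for record in records:
        lines.append("\t".join(record[name] for name in field_names))
    return "\n".join(lines)


# Hypothetical, shortened header and INFO value for demonstration only.
header = (
    '##INFO=<ID=FUNCOTATION,Number=A,Type=String,Description="Funcotation '
    'fields are: Gencode_34_hugoSymbol|Gencode_34_ncbiBuild">'
)
names = funcotation_field_names(header)
records = split_funcotations("[BRCA1|hg38],[TP53|hg38]", names)
print(to_tsv(records, names))
```

Reading the column names from the header rather than hard-coding them means the same sketch would follow whatever data sources and versions the VCF was annotated with. The per-record dictionaries could also serve the second suggestion by prefixing each key with FUNCOTATION_.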
Thanks
(created from Zendesk ticket #45403)
gz#45403