Handle RefSeq transcript annotations in transcript version annotations

Open nawatts opened this issue 3 years ago • 0 comments

VEP annotations for Ensembl transcripts do not include version numbers. Instead, version numbers are filled in using data from the GENCODE release used by VEP (taken from the output of the genes data pipeline).

https://github.com/broadinstitute/gnomad-browser/blob/b4e38686e23fe4a31ff7c9541fe78b82b4df286b/data-pipeline/src/data_pipeline/data_types/variant/transcript_consequence/annotate_transcript_consequences.py#L63-L72

However, annotations for RefSeq transcripts do include version numbers in the transcript ID field. The ID and version need to be split into different fields to match annotations for Ensembl transcripts.
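A minimal sketch of the split, in plain Python rather than the Hail expressions the pipeline uses. The names `TranscriptRef` and `split_transcript_version` are hypothetical, and the sketch assumes that a RefSeq ID carries its version after the last `.` (as in `NM_000546.6`) and that the version is kept as a string:

```python
from typing import NamedTuple, Optional


class TranscriptRef(NamedTuple):
    """Hypothetical container for the two target fields."""
    transcript_id: str
    transcript_version: Optional[str]


def split_transcript_version(raw_id: str) -> TranscriptRef:
    """Split 'NM_000546.6' into ('NM_000546', '6').

    IDs without a numeric suffix after the last '.' (for example,
    unversioned Ensembl IDs) are returned unchanged with a version of None.
    """
    base, sep, version = raw_id.rpartition(".")
    if sep and base and version.isdigit():
        return TranscriptRef(base, version)
    return TranscriptRef(raw_id, None)
```

With this approach, Ensembl transcripts would still get their version from the GENCODE lookup, while RefSeq transcripts would get it from the ID itself.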

nawatts, Jan 10 '22 19:01