gnomad-browser
Browse by RefSeq gene/transcript
Currently, gnomAD can only be browsed by Ensembl genes / transcripts. It should also support RefSeq genes / transcripts.
This only applies to gnomAD v3.1+. This requires VEP annotations with RefSeq transcripts, which were not included in prior versions of gnomAD.
- [ ] #727
- [x] #744
- [ ] #862
- [ ] #863
- [ ] #864
- [ ] #865
- [ ] #871
- [ ] #866
- [ ] #867
Looks like some RefSeq annotations are missing in VEP 101, which is used for annotations in gnomAD v3.1. https://github.com/Ensembl/ensembl-vep/issues/847
In gnomAD v3.1.1, variants in BRCA1 have only Ensembl annotations.
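import hail as hl

# Collect the distinct gene IDs in the VEP transcript consequences for the BRCA1 region.
ds = hl.read_table("gs://gcp-public-data--gnomad/release/3.1.1/ht/genomes/gnomad.genomes.v3.1.1.sites.ht")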
ds = hl.filter_intervals(ds, [hl.parse_locus_interval("chr17:43044295-43125364", reference_genome="GRCh38")])
ds.aggregate(hl.agg.explode(hl.agg.collect_as_set, ds.vep.transcript_consequences.map(lambda csq: csq.gene_id)))
# frozenset({'ENSG00000012048', 'ENSG00000198496', 'ENSG00000240828'})
In contrast, variants in PCSK9 have both Ensembl and RefSeq annotations.
ds = hl.read_table("gs://gcp-public-data--gnomad/release/3.1.1/ht/genomes/gnomad.genomes.v3.1.1.sites.ht")
ds = hl.filter_intervals(ds, [hl.parse_locus_interval("chr1:55039447-55064852", reference_genome="GRCh38")])
ds.aggregate(hl.agg.explode(hl.agg.collect_as_set, ds.vep.transcript_consequences.map(lambda csq: csq.gene_id)))
# frozenset({'23358', '255738', 'ENSG00000162402', 'ENSG00000169174'})
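To make the comparison between the two results easier to read, the collected IDs can be split by source. This is only a minimal sketch: `split_gene_ids` is a hypothetical helper, and it assumes that Ensembl gene IDs start with `ENSG` while RefSeq-derived gene IDs appear as purely numeric NCBI gene IDs, which is what the PCSK9 output above suggests.

```python
def split_gene_ids(gene_ids):
    """Partition gene IDs into (ensembl, refseq, other) sets.

    Assumption: Ensembl IDs start with "ENSG" and RefSeq-derived IDs are
    purely numeric NCBI gene IDs. Anything else lands in `other`.
    """
    ids = set(gene_ids)
    ensembl = {g for g in ids if g.startswith("ENSG")}
    refseq = {g for g in ids if g.isdigit()}
    other = ids - ensembl - refseq
    return ensembl, refseq, other


# Gene ID sets copied from the two aggregation results above.
brca1_ids = frozenset({"ENSG00000012048", "ENSG00000198496", "ENSG00000240828"})
pcsk9_ids = frozenset({"23358", "255738", "ENSG00000162402", "ENSG00000169174"})

for region, ids in [("BRCA1", brca1_ids), ("PCSK9", pcsk9_ids)]:
    ensembl, refseq, other = split_gene_ids(ids)
    # A region with no numeric IDs has no RefSeq annotations under the assumption above.
    print(region, "has RefSeq annotations:", bool(refseq))
```

The same check could be run per gene to list which genes are missing RefSeq annotations, though the prefix and numeric-ID heuristic should be verified against the actual VEP output first.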
This may have to wait until we update to a different version of VEP.
This can be worked on now that the correct RefSeq GTF file has been identified. We can choose to delay release until annotations are fixed or flag the affected genes in the browser.
Discussion from the 2024-03-19 roadmapping meeting. Two things were decided:
1. Annotating transcripts with MANE Plus Clinical would be valuable. Basically, this would be a second asterisk added to show this annotation.
2. Getting all RefSeq transcripts would be nice, but is more difficult than 1.