
Browse by RefSeq gene/transcript

Open nawatts opened this issue 3 years ago • 3 comments

Currently, gnomAD can only be browsed by Ensembl genes / transcripts. It should also support RefSeq genes / transcripts.

This only applies to gnomAD v3.1+. This requires VEP annotations with RefSeq transcripts, which were not included in prior versions of gnomAD.

  • [ ] #727
  • [x] #744
  • [ ] #862
  • [ ] #863
  • [ ] #864
  • [ ] #865
  • [ ] #871
  • [ ] #866
  • [ ] #867

nawatts avatar Mar 22 '21 16:03 nawatts

Looks like some RefSeq annotations are missing in VEP 101, which is used for annotations in gnomAD v3.1. https://github.com/Ensembl/ensembl-vep/issues/847

In gnomAD v3, variants in BRCA1 have only Ensembl annotations.

import hail as hl
ds = hl.read_table("gs://gcp-public-data--gnomad/release/3.1.1/ht/genomes/gnomad.genomes.v3.1.1.sites.ht")
# Restrict to the BRCA1 region on GRCh38.
ds = hl.filter_intervals(ds, [hl.parse_locus_interval("chr17:43044295-43125364", reference_genome="GRCh38")])
# Collect the distinct gene IDs across all VEP transcript consequences.
ds.aggregate(hl.agg.explode(hl.agg.collect_as_set, ds.vep.transcript_consequences.map(lambda csq: csq.gene_id)))
# frozenset({'ENSG00000012048', 'ENSG00000198496', 'ENSG00000240828'})

By contrast, variants in PCSK9 have both Ensembl and RefSeq annotations.

ds = hl.read_table("gs://gcp-public-data--gnomad/release/3.1.1/ht/genomes/gnomad.genomes.v3.1.1.sites.ht")
# Restrict to the PCSK9 region on GRCh38.
ds = hl.filter_intervals(ds, [hl.parse_locus_interval("chr1:55039447-55064852", reference_genome="GRCh38")])
# Collect the distinct gene IDs across all VEP transcript consequences.
ds.aggregate(hl.agg.explode(hl.agg.collect_as_set, ds.vep.transcript_consequences.map(lambda csq: csq.gene_id)))
# frozenset({'23358', '255738', 'ENSG00000162402', 'ENSG00000169174'})
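The two results above mix ID namespaces, so it can help to split them apart when checking a region. A minimal sketch in plain Python, assuming Ensembl gene IDs start with `ENSG` and that purely numeric IDs are the NCBI gene IDs attached to RefSeq transcripts (`split_gene_ids` is a hypothetical helper, not part of Hail or the browser):

```python
def split_gene_ids(gene_ids):
    """Partition gene IDs by apparent source.

    Assumption: "ENSG..." means Ensembl, all-digit strings mean NCBI/RefSeq.
    Anything else is returned under "other" rather than guessed at.
    """
    ensembl = {g for g in gene_ids if g.startswith("ENSG")}
    refseq = {g for g in gene_ids if g.isdigit()}
    other = set(gene_ids) - ensembl - refseq
    return {"ensembl": ensembl, "refseq": refseq, "other": other}


# Gene ID sets copied from the two aggregations in this comment.
brca1_region = frozenset({"ENSG00000012048", "ENSG00000198496", "ENSG00000240828"})
pcsk9_region = frozenset({"23358", "255738", "ENSG00000162402", "ENSG00000169174"})

print(sorted(split_gene_ids(brca1_region)["refseq"]))  # []
print(sorted(split_gene_ids(pcsk9_region)["refseq"]))  # ['23358', '255738']
```

An empty `refseq` set for a region is the symptom described here: only Ensembl annotations are present.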

This may have to wait until we update to a different version of VEP.

nawatts avatar Aug 18 '21 19:08 nawatts

This can be worked on now that the correct RefSeq GTF file has been identified. We can choose to delay release until annotations are fixed or flag the affected genes in the browser.
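For the second option, flagging affected genes, one possible approach is to precompute which genes have no RefSeq-derived annotation and pass that list to the browser. A rough sketch, assuming the per-gene ID sets have already been collected (for example with aggregations like the ones earlier in this thread) and that numeric gene IDs indicate RefSeq annotations; `genes_missing_refseq` and the input mapping are hypothetical:

```python
def genes_missing_refseq(gene_ids_by_gene):
    """Return, in sorted order, the genes whose ID set has no numeric ID.

    Assumption: a purely numeric gene ID comes from a RefSeq annotation,
    so a set without one means RefSeq annotations are missing.
    """
    return sorted(
        gene
        for gene, gene_ids in gene_ids_by_gene.items()
        if not any(gene_id.isdigit() for gene_id in gene_ids)
    )


# Keys are the genes whose regions were queried earlier in this thread;
# values are the gene ID sets those queries returned.
observed = {
    "BRCA1": {"ENSG00000012048", "ENSG00000198496", "ENSG00000240828"},
    "PCSK9": {"23358", "255738", "ENSG00000162402", "ENSG00000169174"},
}

print(genes_missing_refseq(observed))  # ['BRCA1']
```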

nawatts avatar Dec 14 '21 20:12 nawatts

At the 2024-03-19 roadmapping meeting, two things were decided:

  1. Annotating transcripts with MANE Plus Clinical would be valuable. This would be shown as a second asterisk marking the annotation.


  2. Getting all RefSeq transcripts would be nice, but it is more difficult than item 1.

mattsolo1 avatar Mar 20 '24 12:03 mattsolo1
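As an illustration of the first decision, here is a minimal sketch of how a transcript label could carry a second marker. It assumes the existing asterisk marks MANE Select transcripts; the `Transcript` class, its field names, the marker strings, and the transcript IDs are all hypothetical placeholders, not the browser's actual data model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    transcript_id: str
    is_mane_select: bool = False         # hypothetical field name
    is_mane_plus_clinical: bool = False  # hypothetical field name


def transcript_label(transcript, select_marker="*", plus_clinical_marker="**"):
    """Return the transcript ID followed by any applicable markers."""
    markers = []
    if transcript.is_mane_select:
        markers.append(select_marker)
    if transcript.is_mane_plus_clinical:
        markers.append(plus_clinical_marker)
    return " ".join([transcript.transcript_id, *markers])


# Placeholder IDs, not real transcripts.
transcripts = [
    Transcript("TX_A", is_mane_select=True),
    Transcript("TX_B", is_mane_plus_clinical=True),
    Transcript("TX_C"),
]
for t in transcripts:
    print(transcript_label(t))
```

Keeping the markers as parameters leaves the choice of symbol open until the display is settled.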