Compare mehari annotation with VEP annotation

Open tedil opened this issue 1 year ago • 3 comments

  • build mehari transcript DB (ensembl):
    • download cdot for ensembl and grch37 / grch38
    • download ensembl FASTA for transcripts
    • create / fill seqrepo with ENSEMBL FASTAs
    • build mehari database
  • install VEP + caches
    • caches: grch37 release 105, grch38 release 110
  • obtain test dataset, e.g.:
    • clinvar VCF
      • https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/
      • https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/
    • genes with few variants: TGDS, KYNU
    • genes with many variants: BRCA1, BRCA2, TTN
    • later: extend to include certain regions in gnomAD (via tabix https://)
      • regions: BRCA1, BRCA2, TTN, SLC39A14 (ManePlusClinical)
  • comparison:
    • VEP on ClinVar GRCh37/GRCh38
    • local check: build data for e.g. TGDS
    • annotate all transcripts with both mehari and vep
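
A minimal sketch of what the per-transcript comparison step could look like, not what the project actually runs. It assumes both tools write pipe-delimited annotations into an INFO field (VEP's `CSQ`; an `ANN`-style field is assumed for mehari), that the field order is read from each VCF header, and that transcript versions can be ignored when matching. The function names `parse_annotations` and `compare` and the transcript IDs are hypothetical.

```python
from collections import defaultdict


def parse_annotations(info_value, field_names, tx_field, csq_field):
    """Turn one pipe-delimited INFO annotation value into
    {transcript_id: set of consequence terms}.

    field_names is the field order taken from the VCF header;
    tx_field / csq_field name the transcript and consequence columns.
    """
    per_tx = defaultdict(set)
    for entry in info_value.split(","):  # one entry per allele/transcript
        record = dict(zip(field_names, entry.split("|")))
        # Assumption: drop the version suffix so ENST_A.1 matches ENST_A.
        tx = record.get(tx_field, "").split(".")[0]
        if not tx:
            continue
        terms = record.get(csq_field, "").split("&")  # '&' joins multiple terms
        per_tx[tx].update(t for t in terms if t)
    return dict(per_tx)


def compare(a, b):
    """Classify every transcript annotated by either tool for one variant."""
    result = {"match": [], "mismatch": [], "only_a": [], "only_b": []}
    for tx in sorted(set(a) | set(b)):
        if tx not in b:
            result["only_a"].append(tx)
        elif tx not in a:
            result["only_b"].append(tx)
        elif a[tx] == b[tx]:
            result["match"].append(tx)
        else:
            result["mismatch"].append(tx)
    return result


# Made-up annotation strings with a reduced field set, for illustration only.
vep = parse_annotations(
    "T|missense_variant|ENST_A,T|intron_variant|ENST_B",
    ["Allele", "Consequence", "Feature"], "Feature", "Consequence",
)
mehari = parse_annotations(
    "T|missense_variant|ENST_A.1,T|splice_region_variant&intron_variant|ENST_B.2",
    ["Allele", "Annotation", "Feature_ID"], "Feature_ID", "Annotation",
)
print(compare(vep, mehari))
```

Comparing sets of consequence terms per transcript keeps the check independent of term ordering; stripping versions hides version mismatches between the two transcript sources, so those would need a separate check.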

tedil • May 29 '24 13:05

I had started https://github.com/varfish-org/annotation-zoo a while ago, but that repo got stalled, since I was busy with DHA stuff. It would be great if we could put repro stuff there unless we want to move towards a more monorepo approach for mehari.

xiamaz • May 29 '24 14:05

> I had started https://github.com/varfish-org/annotation-zoo a while ago, but that repo got stalled, since I was busy with DHA stuff. It would be great if we could put repro stuff there unless we want to move towards a more monorepo approach for mehari.

Let's continue in a central repo, but I'd rather have this called mehari-validation or similar?

holtgrewe • May 29 '24 14:05

We do this in https://github.com/varfish-org/mehari-annotation-comparison but this potentially needs some tidying up and a README

tedil • Jan 03 '25 10:01