gnomad-browser
Explore gnomAD datasets on the web
ClinVar variants are annotated with gnomAD allele count (AC) and related fields. The ClinVar GRCh38 data pipeline currently uses gnomAD v3 for this. https://github.com/broadinstitute/gnomad-browser/blob/b4e38686e23fe4a31ff7c9541fe78b82b4df286b/data-pipeline/src/data_pipeline/pipelines/clinvar_grch38.py#L56-L65
Create a [data pipeline](https://github.com/broadinstitute/gnomad-browser/tree/main/data-pipeline/src/data_pipeline) for gnomAD v4 variants. Much of the [v3 pipeline](https://github.com/broadinstitute/gnomad-browser/blob/main/data-pipeline/src/data_pipeline/datasets/gnomad_v3/gnomad_v3_variants.py) can probably be reused, but exome data will need to be added and joined to genome data.
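To illustrate what "joined to genome data" could mean, here is a minimal plain-Python sketch of a full outer join keyed by variant. It is not the pipeline's Hail code; the key shape `(chrom, pos, ref, alt)`, the `ac`/`an` field names, the function name `join_exome_genome`, and the sample records are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
VariantKey = Tuple[str, int, str, str]
Record = Dict[str, int]


def join_exome_genome(
    exomes: Dict[VariantKey, Record],
    genomes: Dict[VariantKey, Record],
) -> Dict[VariantKey, Dict[str, Optional[Record]]]:
    """Full outer join: keep every variant seen in either data set.

    A variant missing from one data set gets None on that side, so
    downstream code can tell "not observed" apart from "observed with AC 0".
    """
    joined = {}
    for key in sorted(set(exomes) | set(genomes)):
        joined[key] = {"exome": exomes.get(key), "genome": genomes.get(key)}
    return joined


# Made-up records purely for illustration.
exomes = {("1", 55051215, "G", "A"): {"ac": 3, "an": 1000}}
genomes = {
    ("1", 55051215, "G", "A"): {"ac": 1, "an": 500},
    ("1", 55051300, "C", "T"): {"ac": 2, "an": 500},
}
variants = join_exome_genome(exomes, genomes)
```

An outer join (rather than an inner join) is assumed here because a variant may be present in only one of the two data sets.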
Check if a newer version of the [MANE Select](https://www.ncbi.nlm.nih.gov/refseq/MANE/) transcripts is available and, if so, update the version used in the genes data pipeline. https://github.com/broadinstitute/gnomad-browser/blob/b4e38686e23fe4a31ff7c9541fe78b82b4df286b/data-pipeline/src/data_pipeline/pipelines/genes.py#L82-L84
Update the GENCODE version used in the genes data pipeline to match the one used by the VEP release that annotated v4. https://github.com/broadinstitute/gnomad-browser/blob/b4e38686e23fe4a31ff7c9541fe78b82b4df286b/data-pipeline/src/data_pipeline/pipelines/genes.py#L36-L38
WIP: https://github.com/broadinstitute/gnomad-browser/pull/890
It looks like our GKE cluster running gnomad won't auto-upgrade to the Kubernetes 1.22 release, because some resources we have deployed use deprecated APIs: Still have to track down...
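One way to start tracking these down might be to scan manifest text for `apiVersion` values that Kubernetes 1.22 stopped serving. The sketch below is an assumption-laden helper, not part of this repository: the function name `find_removed_api_versions` and the sample manifest are hypothetical, and the set of API versions is a partial list that should be checked against the official Kubernetes deprecation guide. It also flags by `apiVersion` alone and ignores `kind`.

```python
import re

# Partial list (assumption: verify against the Kubernetes deprecated API
# migration guide) of beta API versions no longer served as of 1.22.
REMOVED_IN_1_22 = frozenset({
    "admissionregistration.k8s.io/v1beta1",
    "apiextensions.k8s.io/v1beta1",
    "apiregistration.k8s.io/v1beta1",
    "certificates.k8s.io/v1beta1",
    "extensions/v1beta1",
    "networking.k8s.io/v1beta1",
    "rbac.authorization.k8s.io/v1beta1",
})

# Matches lines such as `apiVersion: networking.k8s.io/v1beta1`,
# with optional quotes around the value.
API_VERSION_LINE = re.compile(r"^\s*apiVersion:\s*['\"]?([^'\"\s#]+)")


def find_removed_api_versions(manifest_text):
    """Return (line_number, api_version) pairs for flagged apiVersion lines."""
    findings = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        match = API_VERSION_LINE.match(line)
        if match and match.group(1) in REMOVED_IN_1_22:
            findings.append((number, match.group(1)))
    return findings


# Hypothetical manifest, not taken from the gnomad-browser deployment.
manifest = """\
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
---
apiVersion: apps/v1
kind: Deployment
"""
findings = find_removed_api_versions(manifest)
```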
We should pin the latest version of Hail used to successfully run data pipeline tasks. It could be specified in [data-pipeline/requirements.txt](https://github.com/broadinstitute/gnomad-browser/blob/b497106d97773affd81b48eadfa5586259e011e5/data-pipeline/requirements.txt). However, since some pipeline tasks are computationally intensive and...
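As a rough illustration of what "pinned" means in a requirements file, the sketch below looks for an exact `package==version` line. The helper name `pinned_version` is hypothetical, the sample text is not the repository's actual `requirements.txt`, and `0.0.0` is a placeholder rather than a real Hail version.

```python
import re

# Matches an exact pin such as `hail==0.0.0`; range specifiers like `>=`
# or a bare package name are deliberately not treated as pinned.
PINNED = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([^\s;#]+)")


def pinned_version(requirements_text, package):
    """Return the exact pinned version of `package`, or None if not pinned."""
    for line in requirements_text.splitlines():
        match = PINNED.match(line)
        if match and match.group(1).lower() == package.lower():
            return match.group(2)
    return None


# Placeholder contents; the version number is illustrative only.
example_requirements = "hail==0.0.0\ntqdm\n"
hail_pin = pinned_version(example_requirements, "hail")
tqdm_pin = pinned_version(example_requirements, "tqdm")
```

A check like this could run in CI to fail the build when the Hail requirement is left unpinned.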
Reload variant data after the changes to the data pipeline in #862, #863, #864, and #865.