Use GA4GH VRS identifier as Elasticsearch document ID for variants?
To avoid the possibility of indexing multiple copies of a document in Elasticsearch, each document must have a unique ID.
Currently, the document ID used for variants is a compressed form of the chrom-pos-ref-alt variant ID.
https://github.com/broadinstitute/gnomad-browser/blob/95f5d24f540f8132dc9cf546226ae9581bb095bb/data-pipeline/src/data_pipeline/pipelines/export_to_elasticsearch.py#L44-L45
https://github.com/broadinstitute/gnomad-browser/blob/95f5d24f540f8132dc9cf546226ae9581bb095bb/data-pipeline/src/data_pipeline/data_types/variant/variant_id.py#L68-L94
The variant ID itself cannot be used as-is because ES document IDs are limited to 512 bytes, and the ref and alt alleles of a large indel can push the ID past that limit.
https://www.elastic.co/guide/en/elasticsearch/reference/7.x/mapping-id-field.html
However, the technique currently used is not guaranteed to work with longer variant IDs from larger indels.
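To make the size problem concrete, here is a minimal sketch that measures the byte length of an uncompressed chrom-pos-ref-alt ID. The helper name `id_byte_length` and the example coordinates are made up for illustration, and this does not reproduce the compression in `variant_id.py`; it only shows how quickly a large indel exceeds the limit before any compression is applied.

```python
# Documented Elasticsearch limit for the _id field.
ES_MAX_ID_BYTES = 512


def id_byte_length(chrom: str, pos: int, ref: str, alt: str) -> int:
    """Return the UTF-8 byte length of a plain chrom-pos-ref-alt variant ID."""
    variant_id = f"{chrom}-{pos}-{ref}-{alt}"
    return len(variant_id.encode("utf-8"))


# Hypothetical variants: a SNV and a 599 bp deletion at the same position.
snv_length = id_byte_length("1", 55516888, "G", "A")
deletion_length = id_byte_length("1", 55516888, "A" * 600, "A")

print(snv_length <= ES_MAX_ID_BYTES)       # SNV fits comfortably
print(deletion_length <= ES_MAX_ID_BYTES)  # large deletion does not
```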
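GA4GH VRS computed identifiers don't have this problem: they are built from a truncated SHA-512 digest, so their length is fixed regardless of allele size.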
https://vrs.ga4gh.org/en/latest/impl-guide/computed_identifiers.html
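As a rough illustration of why a digest-based ID stays small, the sketch below implements the GA4GH truncated digest (SHA-512, first 24 bytes, base64url encoded) using only the standard library. Note the assumption: `fixed_length_document_id` is a hypothetical helper that hashes the plain variant ID string, not the canonical VRS serialization of an Allele, so its output is not a real VRS identifier. A real implementation would more likely go through a VRS library.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """GA4GH truncated digest: SHA-512, first 24 bytes, base64url encoded."""
    digest = hashlib.sha512(blob).digest()
    # 24 bytes encode to exactly 32 base64url characters, with no padding.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def fixed_length_document_id(variant_id: str) -> str:
    """Hypothetical helper: hash a chrom-pos-ref-alt ID to a 32-character ID.

    Assumption: hashes the plain ID string, NOT the VRS Allele serialization.
    """
    return sha512t24u(variant_id.encode("utf-8"))


short_id = fixed_length_document_id("1-55516888-G-A")
long_id = fixed_length_document_id("1-55516888-" + "A" * 600 + "-A")

print(len(short_id), len(long_id))
```

Both IDs come out at 32 characters, well under the 512-byte limit, whatever the allele length.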
Related to #658.