Representing caseLevelData/zygosity with VRS alleles

Open daisieh opened this issue 1 year ago • 19 comments

If I'm creating a genomicVariant specification from a VCF variant record, I can't see how I'd specify multiple alleles in a single genomicVariant: the variation property seems to be singular. For example, a variant record might have a ref of A and alts of C,T. Samples in that record might have genotypes that correspond to A/C, A/A, A/T, and C/T.
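To make the example concrete, here is a minimal sketch of how those genotypes come out of a VCF record. It assumes the usual VCF convention that GT index 0 is the ref and 1, 2, ... index the alts in order; the function name `decode_genotype` is hypothetical and not part of Beacon or VRS.

```python
def decode_genotype(ref, alts, gt):
    """Translate a VCF GT string such as "0/1" into the allele bases it names.

    Index 0 is the ref allele; 1..n index the alt alleles in order.
    A missing call (".") is passed through unchanged.
    """
    alleles = [ref] + list(alts)
    sep = "|" if "|" in gt else "/"  # phased vs. unphased separator
    return ["." if idx == "." else alleles[int(idx)] for idx in gt.split(sep)]


ref, alts = "A", ["C", "T"]
for gt in ["0/1", "0/0", "0/2", "1/2"]:
    print(gt, "->", "/".join(decode_genotype(ref, alts, gt)))
```

The last genotype, 1/2, is the one with no reference allele in it at all.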

LegacyVariation seems able to capture the basic VCF-format ref/alt, at least when there is only one alternate allele, but there does not seem to be an option for multiple alt alleles. So I could capture zygosity/genotype for A/A and A/T as caseLevelData under one variation, and A/A and A/C under a different one (though even then, how would I know which variation the A/A cases belong to?). But how would I represent C/T samples?
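The problem with the one-variation-per-alt approach can be sketched as follows. This is only an illustration of the bookkeeping, not Beacon code: the sample names and the helper `per_alt_view` are made up, and it assumes each variation can describe only genotypes built from the ref and its single alt.

```python
def per_alt_view(ref, alts, sample_alleles):
    """For each alt, list the samples whose genotype uses only ref and that alt."""
    view = {}
    for alt in alts:
        allowed = {ref, alt}
        view[alt] = [
            name
            for name, alleles in sample_alleles.items()
            if set(alleles) <= allowed
        ]
    return view


# Hypothetical samples carrying the genotypes from the example record.
samples = {
    "s1": ["A", "C"],
    "s2": ["A", "A"],
    "s3": ["A", "T"],
    "s4": ["C", "T"],
}

view = per_alt_view("A", ["C", "T"], samples)
unplaced = [s for s in samples if all(s not in names for names in view.values())]

print(view)      # the A/A sample is listed under both alts
print(unplaced)  # the C/T sample fits under neither
```

The A/A sample fits under both variations, so it is ambiguous, and the C/T sample fits under neither.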

VRS's MolecularVariation seems to be the preferred schema going forward, I assume. In this schema there seems to be no notion of a reference allele at all: each allele is represented by a single variation. But without the ability to specify multiple variations for a genomicVariant, how would I represent zygosity in caseLevelData?

daisieh, Feb 15 '23 20:02