vrs
Categorical variation
Some variants are described in terms of exclusionary criteria. These will need to be considered in our model.
See https://civicdb.org/events/genes/5/summary/variants/2408/summary#variant for an example.
Would it be more accurate to say "some sets or groups of variants are defined, at least in part, in terms of exclusionary criteria"?
That said, the example is essentially "lack of a variant at a given position", which when stated in the positive, is "presence of reference sequence at a given position". If we can express the reference sequence as an Allele, then a set of variants could be defined using that as an inclusion criterion rather than the negative form as an exclusionary criterion.
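To make the "inclusion rather than exclusion" idea concrete, here is a minimal sketch. `ProteinAllele`, the `"BRAF-protein"` identifier, and `satisfies_inclusion` are hypothetical names invented for illustration; this is not the VRS Allele schema, only plain Python showing a reference state used as a positive membership test.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProteinAllele:
    """Hypothetical stand-in for an allele; NOT the VRS Allele schema."""
    accession: str  # placeholder sequence identifier (assumption)
    position: int   # 1-based residue position
    state: str      # residue observed at that position


# Reference state at BRAF p.600 (valine), written V600V in this thread.
REF_V600 = ProteinAllele("BRAF-protein", 600, "V")


def satisfies_inclusion(observed: Iterable[ProteinAllele],
                        required: ProteinAllele) -> bool:
    """True when the required allele (here, the reference state) is present."""
    return required in set(observed)


# One sample that is reference at 600 and altered at 601, one that is V600E.
wild_type_at_600 = [ProteinAllele("BRAF-protein", 600, "V"),
                    ProteinAllele("BRAF-protein", 601, "E")]
mutant_at_600 = [ProteinAllele("BRAF-protein", 600, "E")]

print(satisfies_inclusion(wild_type_at_600, REF_V600))
print(satisfies_inclusion(mutant_at_600, REF_V600))
```

The point of the sketch is only that "not a V600 variant" never has to be stated: the set is defined by requiring the reference allele.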
Wouldn't non-V600 be all variants at V600 that are not the same as reference? If so, isn't that what those nasty ambiguity codes in the IUPAC list are for? I didn't look, but I presume they have ambiguity codes for all the amino acid residue combos, like they do for nucleotides. If not, then we would have to do something special to support this.
Likely it will get thrown into the categorical variation bucket. (Maybe?)
I think there are two separable issues here.
- I agree with @rrfreimuth that asserting reference is approximately the same thing as negating the existence of wildcard variation. Asserting reference is preferable.
- @larrybabb: This issue is not about other AA at p.600. Instead, it's about variation at other locations in the context of a reference V600V (i.e., ref). For example, the statement we'd like is something like V600V and K601E.
So, IMO, this is just another flavor of co-occurring variation.
@rrfreimuth and @reece, you've got it. This variant is about variation occurring somewhere other than V600, effectively the notion of a non-reference presentation of the BRAF gene (in its entirety). From the first evidence item description, it is clear that the additional condition of reference p.600 (V600V) is included in the definition of this variant.
The challenge isn't with the co-occurring variation component (though I agree it's a component, as we're asserting reference at p.600 and an altered state elsewhere). Instead, the challenge is how we describe (a) a "non-reference" / negative state for the full protein sequence, plus (b) a co-occurring reference state at V600.
This issue should start with how we resolve (a).
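As a rough sketch of how (a) and (b) combine into one test, assuming a toy sparse reference that covers only the two positions named in this thread: `is_non_v600_variant` and the dictionaries are hypothetical, and positions missing from `observed` are assumed to match reference.

```python
from typing import Dict

# Toy sparse reference: only the residues mentioned in this thread.
# A real implementation would need the full protein sequence.
BRAF_REFERENCE: Dict[int, str] = {600: "V", 601: "K"}


def is_non_v600_variant(reference: Dict[int, str],
                        observed: Dict[int, str],
                        anchor: int = 600) -> bool:
    """(b) reference state at the anchor AND (a) non-reference somewhere else.

    Positions absent from `observed` are treated as reference; positions
    absent from the toy `reference` are ignored.
    """
    anchor_is_reference = observed.get(anchor, reference[anchor]) == reference[anchor]
    altered_elsewhere = any(
        pos != anchor and pos in reference and residue != reference[pos]
        for pos, residue in observed.items()
    )
    return anchor_is_reference and altered_elsewhere


print(is_non_v600_variant(BRAF_REFERENCE, {600: "V", 601: "E"}))  # V600V and K601E
print(is_non_v600_variant(BRAF_REFERENCE, {600: "E"}))            # V600E only
```

Part (a) is the hard part in practice: here it is an open-ended `any(...)` over every other position, which is why it reads as a category rather than a single allele.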
@larrybabb the ambiguity codes apply only to nucleotides. Due to the alphabet size for amino acids, it is infeasible to specify one-character ambiguity codes.
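The alphabet-size argument can be checked with quick arithmetic. A complete set of ambiguity codes needs one symbol per non-empty subset of the alphabet; the helper name below is made up for illustration, and the alphabet sizes assume the 4 standard nucleotides and the 20 standard amino acids.

```python
def ambiguity_code_count(alphabet_size: int) -> int:
    """Number of non-empty subsets of an alphabet of the given size."""
    return 2 ** alphabet_size - 1


nucleotide_codes = ambiguity_code_count(4)    # 15
amino_acid_codes = ambiguity_code_count(20)   # 1048575

print(nucleotide_codes, amino_acid_codes)
```

Fifteen symbols fit in a one-letter code; roughly a million do not.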
This issue was marked stale due to inactivity.
"Negative Variants" is opaque. Can we call this issue something else? How about "Non-specific variation"?
Okay. I gave it some thought, looked through the set of most common biomarkers in CIViC, and have decided that this issue is primarily about a form of categorical variation. I prefer describing these as aggregative or categorical vs non-specific, since the criteria are well-defined and exact. "Non-specific" variants can mean many things, including variants with fuzzy intervals, or insertions / deletions of approximate size and/or unknown sequence.
Above we discussed the V600V + non-V600 alteration scenario (position exclusionary), but I'd also like us to consider here the position-bound non-reference variants, such as BRAF V600, PIK3CA E545, and DNMT3A R882.
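A minimal sketch of the position-bound case, for comparison with the position-exclusionary one: a category is a gene, a position, and the reference residue, and any other residue at that position is a member. `PositionBoundCategory` and `categories_for` are hypothetical names, not part of VRS, and the example calls are only illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PositionBoundCategory:
    """Hypothetical category: any non-reference residue at one position."""
    gene: str
    position: int
    ref: str

    def matches(self, gene: str, position: int, alt: str) -> bool:
        # Same gene and position, and the observed residue differs from reference.
        return gene == self.gene and position == self.position and alt != self.ref


# The three examples named in this thread.
CATEGORIES = [
    PositionBoundCategory("BRAF", 600, "V"),
    PositionBoundCategory("PIK3CA", 545, "E"),
    PositionBoundCategory("DNMT3A", 882, "R"),
]


def categories_for(gene: str, position: int, alt: str) -> List[PositionBoundCategory]:
    """Return every category the observed substitution belongs to."""
    return [c for c in CATEGORIES if c.matches(gene, position, alt)]


print(categories_for("BRAF", 600, "E"))  # member of the BRAF V600 category
print(categories_for("BRAF", 600, "V"))  # reference state: no category
```

The criteria here are exact, which is the argument for calling these categorical rather than non-specific.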
To be handled by Cat-VRS.