
VCF FILTER COLUMN "." value - to PASS or not to PASS, that is the question:

Open nh13 opened this issue 3 years ago • 5 comments

When the FILTER column in a VCF is ".", is that a PASS variant, a non-PASS variant, or something else?

Motivation: I was using pysam's VariantRecord and noticed that when the VCF has "." in the FILTER column, "PASS" in rec.filter is True, but "PASS" in set(rec.filter) is False. This seemed like odd behavior, and it brought on an existential crisis about what "." means in a VCF. Does it mean:

a. no explicit determination has been made yet (i.e. nothing can be said about the variant passing or failing)
b. the variant is implicitly PASSED (e.g. the value PASS says explicitly passed, while "." is implicit)
c. the variant is not yet PASSED (since the PASS value is not set)
d. something else?

nh13 avatar Apr 28 '21 00:04 nh13

The VCF specification does not say, the interpretation is up to the user (i.e. the VCF creator). The most sensible interpretation is to treat . as every other missing value in VCF, which corresponds to the first case from your list.
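As an illustration of that reading, here is a minimal sketch that keeps "." as its own third state instead of folding it into pass or fail. It works on the raw FILTER string rather than on pysam objects, and the names FilterStatus and classify_filter are hypothetical, chosen only for this example.

```python
from enum import Enum


class FilterStatus(Enum):
    """Hypothetical three-way view of the VCF FILTER column."""
    MISSING = "missing"  # FILTER is ".": no determination has been made
    PASSED = "passed"    # FILTER is "PASS"
    FAILED = "failed"    # FILTER lists one or more failed filter codes


def classify_filter(field: str) -> FilterStatus:
    """Classify a raw FILTER column value without guessing what "." means."""
    if field == ".":
        return FilterStatus.MISSING
    if field == "PASS":
        return FilterStatus.PASSED
    # Anything else is a semicolon-separated list of failed filter codes.
    return FilterStatus.FAILED


for value in (".", "PASS", "q10;s50"):
    print(value, classify_filter(value).name)
```

Downstream code can then decide explicitly how to handle MISSING, rather than inheriting a default from whichever library parsed the file.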

pd3 avatar Apr 28 '21 11:04 pd3

Re-opened to suggest that @pd3 explicitly state that a “.” value should not be interpreted one way or the other.

nh13 avatar Apr 29 '21 02:04 nh13

My interpretation is also a) since if filters were applied, then the spec provides PASS as a special value to indicate as such (but it does not explicitly require this).

A VCF can have multiple layers of filtering that can occur at different times (e.g. GRIDSS does raw call filtering, and provides a separate utility to perform somatic filtering on the raw VCF), so it's not clear which filters have PASSed. I guess one could implicitly infer this from the filter metadata headers, but the specs don't provide any guidance on this.

Making the claim that if a filter metadata header exists in the VCF, then the filter has been applied to all variants, and thus restricting the usage of . to VCFs without any ##FILTER headers, seems like something suitable for 'strict' VCF.
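A rough sketch of what such a 'strict' check could look like, assuming plain tab-separated VCF text. This rule is not part of the current specification, and the function name strict_filter_violations is made up for illustration.

```python
def strict_filter_violations(vcf_lines):
    """Return 1-based line numbers of records whose FILTER is "." even
    though the header declares at least one ##FILTER line.

    This encodes the hypothetical 'strict' rule only; the VCF spec
    itself does not forbid this combination.
    """
    has_filter_header = False
    violations = []
    for line_no, line in enumerate(vcf_lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FILTER="):
            has_filter_header = True
        elif not line or line.startswith("#"):
            continue  # other header lines and blank lines
        else:
            columns = line.split("\t")
            # FILTER is the 7th fixed column (index 6).
            if has_filter_header and len(columns) > 6 and columns[6] == ".":
                violations.append(line_no)
    return violations


example = [
    "##fileformat=VCFv4.3",
    '##FILTER=<ID=q10,Description="Quality below 10">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tC\tT\t5\tq10\t.",
    "1\t300\t.\tG\tA\t30\t.\t.",
]
print(strict_filter_violations(example))
```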

d-cameron avatar Apr 29 '21 03:04 d-cameron

@nh13 that's correct, the missing value "." should not be interpreted one way or another.

pd3 avatar Apr 30 '21 06:04 pd3

This was discussed at yesterday's meeting, and there was agreement that there is room for improvement here. In particular, §1.6.1/7 says “if [circumstances] then FILTER must be .” but does not specify anything in the other direction.

Please leave this open so that this can be tracked.

jmarshall avatar Apr 30 '21 06:04 jmarshall