BALSAMIC

Upload soft-filtered variants to Scout?

Open mathiasbio opened this issue 9 months ago • 3 comments

Is your feature request related to a problem? Please describe.

In a validation of the GMS Lymphoid panel, a few variants in the reference samples were filtered out of the VCF because the variant was also present in the normal. VarDict itself sets the Germline filter, and bcftools then removes the flagged variants further downstream in the analysis.

(Image: scatter_plot_briefskink)

Some of these variants in the reference sample had AF_N and AF_T at similar levels, around 0.1; another had a tumor AF of 0.6359 and a normal AF of 0.1264.

In the email discussion of these validation results, some questions were raised about how we set these germline filters, along with some concern about the risk of filtering out real and interesting somatic variants due to tumor-in-normal contamination and other factors.

It was discussed whether we should avoid filtering on the AFs at all and instead just annotate the variants and let users filter them themselves in Scout, or whether we could keep variants tagged as germline if they also had an ACMG status of Pathogenic or Likely-pathogenic.

I think this sounds pretty intriguing! Generally I don't think there are many variants marked as Pathogenic or Likely-pathogenic, so including them in the final VCF regardless of presence in the normal would be a nice way to avoid the risk of filtering them out.
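As a rough illustration of the rescue rule discussed above, here is a minimal Python sketch, not BALSAMIC code. It assumes the germline flag appears as a `Germline` label in the FILTER column and that the pathogenicity status is available in an INFO key called `CLNSIG`; both names are assumptions for the example and may differ from what the pipeline actually writes.

```python
import re

# Assumed names for illustration: a "Germline" FILTER label and a "CLNSIG" INFO key.
RESCUE_SIGNIFICANCE = {"pathogenic", "likely_pathogenic"}


def parse_info(info: str) -> dict:
    """Turn a VCF INFO string into a dict; flag-only keys map to ''."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def should_keep(vcf_line: str) -> bool:
    """Keep unfiltered records, plus germline-only records that are (likely) pathogenic."""
    cols = vcf_line.rstrip("\n").split("\t")
    filters = set(cols[6].split(";"))
    if filters <= {"PASS", "."}:
        return True
    if filters != {"Germline"}:
        # Any other filter still removes the record in this sketch.
        return False
    info = parse_info(cols[7])
    significance = {s.lower() for s in re.split(r"[,|/]", info.get("CLNSIG", ""))}
    return bool(significance & RESCUE_SIGNIFICANCE)
```

With this rule a record filtered only as `Germline` survives when its significance is Pathogenic or Likely_pathogenic, while records carrying any other filter are still dropped.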

I was also wondering whether we could extend this idea and keep all these variants in the final VCF regardless of any filter. That might bring too many artifacts into Scout, but if it's possible to upload these variants with soft filters like "Poor qual / Presence in germline DB", I think it could be nice.
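To make the soft-filter idea concrete, here is a small Python sketch in which a record is kept and a label is written to its FILTER column instead of the record being dropped. The label names used below (`poor_qual`, `germline_db`) are made up for the example, and whether Scout displays such labels as wanted would need to be checked separately.

```python
def add_soft_filter(vcf_line: str, label: str) -> str:
    """Return the record with `label` added to FILTER; the record itself is kept."""
    cols = vcf_line.rstrip("\n").split("\t")
    # PASS and "." mean "no filter applied", so they are replaced, not appended to.
    existing = [f for f in cols[6].split(";") if f not in ("PASS", ".", "")]
    if label not in existing:
        existing.append(label)
    cols[6] = ";".join(existing)
    return "\t".join(cols)


# Hypothetical labels; a valid VCF would also declare each one
# in a ##FILTER header line.
record = "1\t100\t.\tA\tT\t50\tPASS\tDP=10"
record = add_soft_filter(record, "poor_qual")
record = add_soft_filter(record, "germline_db")
```

After the two calls the FILTER column of `record` holds both labels, so the variant can still be uploaded and hidden or shown by the filters applied on the viewing side.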

Describe the solution you'd like

To avoid modifying the bcftools filters too much, could we extract these types of variants early on, after annotation, into a separate VCF that we don't filter, and then merge them back in when creating the final clinical / research VCF?
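The extract-then-merge flow could look roughly like the Python sketch below. It works on in-memory VCF record lines rather than files, and `is_rescue` and `passes_filters` are hypothetical callables standing in for the rescue rule and the existing bcftools filters; in the real workflow this would more likely be done with bcftools on the VCF files themselves.

```python
def variant_key(vcf_line: str) -> tuple:
    """Identify a variant by chromosome, position, REF and ALT."""
    cols = vcf_line.rstrip("\n").split("\t")
    return (cols[0], int(cols[1]), cols[3], cols[4])


def extract_then_merge(records, is_rescue, passes_filters):
    """Set rescue variants aside before filtering, then merge them back in.

    `is_rescue` and `passes_filters` are hypothetical callables (line -> bool).
    """
    # Step 1: the "separate VCF" of variants that must not be filtered.
    rescued = {variant_key(r): r for r in records if is_rescue(r)}
    # Step 2: the normal filtering path, left unchanged.
    filtered = {variant_key(r): r for r in records if passes_filters(r)}
    # Step 3: merge; a variant present in both streams is emitted once.
    merged = {**rescued, **filtered}
    # Restore the order of the annotated input.
    order = {variant_key(r): i for i, r in enumerate(records)}
    return [merged[key] for key in sorted(merged, key=order.__getitem__)]
```

Keeping the rescue set separate means the existing filter expressions stay untouched, which is the point of the proposal; the cost is an extra merge and deduplication step at the end.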

Current BALSAMIC version

balsamic --version: 12.0.2

mathiasbio, Sep 18 '23 08:09