wrangling-genomics
Explain filtering variants and show its effects with the TS/TV ratio.
This is a contribution for the instructor training checkout. In this PR I would like to stress the importance of step 3 of the variant calling pipeline, where, after variants have been called, they are filtered on the basis of several quality control criteria. To that end, this PR proposes adding the following text to this step:
The vcfutils.pl varFilter call filters out variants that do not meet its default minimum quality criteria, which can be changed through its options. Using bcftools, we can verify that the quality of the variant call set has improved after this filtering step by calculating the ratio of transitions (TS) to transversions (TV), written TS/TV, where transitions should be more likely to occur than transversions:
$ bcftools stats results/bcf/SRR2584866_variants.vcf | grep TSTV
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 628 58 10.83 628 58 10.83
$ bcftools stats results/vcf/SRR2584866_final_variants.vcf | grep TSTV
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 621 54 11.50 621 54 11.50
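To make the TS/TV column concrete, here is a minimal Python sketch of how a substitution can be classified and how the ratio follows from the counts. It is an illustration only, not how bcftools computes its statistics; the function names `classify_snp` and `ts_tv_ratio` are hypothetical, and the sketch assumes simple single-base substitutions.

```python
# Illustrative sketch only: bcftools does this internally; the helper
# names below are hypothetical and not part of any library.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Return "ts" for a transition, "tv" for a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # A transition stays within one chemical class (A<->G or C<->T);
    # a transversion switches between purine and pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "ts" if same_class else "tv"


def ts_tv_ratio(snps):
    """Compute TS/TV for an iterable of (ref, alt) pairs."""
    counts = {"ts": 0, "tv": 0}
    for ref, alt in snps:
        counts[classify_snp(ref, alt)] += 1
    if counts["tv"] == 0:
        raise ValueError("no transversions, ratio undefined")
    return counts["ts"] / counts["tv"]


# Reproduce the ratios from the counts reported above.
print(f"before filtering: {628 / 58:.2f}")
print(f"after filtering:  {621 / 54:.2f}")
```

Dividing the ts and tv counts from the two bcftools stats runs gives the values in the ts/tv column, so the filtered call set has the higher ratio.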
In this lesson bcftools is already being used for variant calling, so the additional time required to introduce this idea should be small. It should also prompt the learner to realize that the quality of the variants can be very heterogeneous and that filtering is an important step to obtain a more homogeneous variant call set.
Thank you for this contribution @rcastelo! This is a nicely worded explanation of the filtering option in bcftools. As these lessons are already quite full, I would recommend not incorporating this into the main text of the lesson, but instead placing the explanation in a callout box using the following syntax:
Filtering
Text of the explanation here. Text of the explanation here. Text of the explanation here.
{: .callout}
@vlrieg - please let me know if you'd like me to update the PR based on this suggestion.
Thanks for taking a look @ErinBecker! It would be great if you could make these changes... I'm not sure I know how to insert the callout boxes yet!
@vlrieg - Done! Please go ahead and merge this if you're happy with it. Use "merge pull request" rather than "squash and merge" to ensure @rcastelo's original commit gets counted.
Thank you so much @ErinBecker and @vlrieg for considering this contribution and for your efforts to integrate it into the episode! I look forward to seeing it live on the website. Best regards!