MosaicForecast

question from MuTect2-PoN_filter.py

Open: mjko1210 opened this issue 2 years ago • 1 comment

Hi,

I have a tumor-only Mutect2 run (with a PoN) and am looking for a way to further filter possible germline variants. I found the script MuTect2-PoN_filter.py and was wondering whether I can use it for this filtering.

python MuTect2-PoN_filter.py test demo/test.Mutect2.vcf resources/SegDup_and_clustered.bed

The code works great! However, I have some questions.

  1. Is "SegDup_and_clustered.bed" (https://github.com/parklab/MosaicForecast/blob/master/resources/SegDup_and_clustered.GRCh38.bed) ready to use, or do I need to create my own?

  2. In the Python code, is there a reason for the two separate conditions, which apply slightly different thresholds? An explanation would be greatly appreciated! https://github.com/parklab/MosaicForecast/blob/d4ef0bb3b006b2819a2d94b63ec7b13195989202/MuTect2-PoN_filter.py#L53-L58

I see 0|1, 0/1, and 1|0 genotypes in the Mutect2 VCF file. I wonder whether there is a biological reason for applying different thresholds here. Thanks!

mjko1210, Apr 18 '22 23:04

Hi @mjko1210,

  1. Yes, the GRCh38 file is ready to use.
  2. Sorry, that is a bug that has not yet been corrected in the parklab repository; the line should check for both "0|1" and "1|0". Could you refer to this file (https://github.com/douym/MosaicForecast/blob/master/MuTect2-PoN_filter.py)? It should be "if re.search(":0|1:", line) or re.search(":1|0:", line):"

douym, May 07 '22 03:05
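
A note on the regular expression quoted in the reply: in Python's `re` module an unescaped `|` is the alternation operator, so the pattern ":0|1:" matches either ":0" or "1:" rather than the literal text ":0|1:". The sketch below is not taken from either repository. The sample strings and the helper name `has_phased_het` are hypothetical and only illustrate how the unescaped pattern, an escaped pattern, and a plain substring test behave differently.

```python
import re

# Unescaped "|" means alternation: this matches ":0" OR "1:".
unescaped = re.compile(":0|1:")
# Escaping the "|" makes it a literal character: this matches ":0|1:" only.
escaped = re.compile(r":0\|1:")


def has_phased_het(text: str) -> bool:
    """Hypothetical helper: True if text contains a literal ':0|1:' or ':1|0:'.

    A plain substring test avoids regex metacharacters altogether.
    """
    return ":0|1:" in text or ":1|0:" in text


# Hypothetical genotype fragments, not real VCF records.
for fragment in [":0|1:", ":1|0:", ":0/1:", ":1/1:"]:
    print(
        fragment,
        bool(unescaped.search(fragment)),
        bool(escaped.search(fragment)),
        has_phased_het(fragment),
    )
```

With these fragments, the unescaped pattern also matches ":0/1:" and ":1/1:", while the escaped pattern and the substring test only accept the phased heterozygous forms. Whether that difference matters for the actual script depends on how its input lines are laid out, which is worth checking against the real file.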