fgbio
FilterConsensusReads --min-reads 1 0 0 vs --min-reads 0 0 0; Why do they differ?
I ran FilterConsensusReads with --min-reads 1 0 0 and with --min-reads 0 0 0, and I get slightly more reads with the second setting. How can these give different results?
@dstephensSD that's a bit surprising, since a consensus read should always have depth of at least 1. Can you show a read that is kept in the second setting but not the first?
You might try setting --max-no-call-fraction to 1? I suspect there is some weird interaction going on where even having the min reads be 1 might occasionally be masking more bases in consensus reads, which then causes the reads to fail the additional filter on how many Ns are in the read. As @nh13 says, a BAM file with 1-2 example reads that pass with 0 0 0 and get filtered out with 1 0 0 would be super helpful in diagnosing.
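To make the suspected interaction concrete, here is a toy sketch in Python. It is not fgbio's implementation: the function names, the per-base depth list, and the 0.2 threshold are all made up for illustration, and it assumes that a consensus base can carry a depth of 0 at some positions, which is only a guess at what is happening here.

```python
# Toy model of the hypothesis above; NOT fgbio's actual code.
# All names and the example threshold are hypothetical.

def masked_no_call_fraction(bases, depths, min_reads):
    """Mask bases whose depth is below min_reads, then return the fraction of Ns."""
    masked = ["N" if depth < min_reads else base
              for base, depth in zip(bases, depths)]
    return masked.count("N") / len(masked)


def passes_filter(bases, depths, min_reads, max_no_call_fraction):
    """Keep the read only if the post-masking N fraction is within the limit."""
    fraction = masked_no_call_fraction(bases, depths, min_reads)
    return fraction <= max_no_call_fraction


# A made-up consensus read in which three positions have zero depth.
bases = "ACGTACGTAC"
depths = [3, 3, 0, 2, 0, 1, 0, 4, 2, 2]

kept_with_0 = passes_filter(bases, depths, min_reads=0, max_no_call_fraction=0.2)
kept_with_1 = passes_filter(bases, depths, min_reads=1, max_no_call_fraction=0.2)
kept_with_1_relaxed = passes_filter(bases, depths, min_reads=1, max_no_call_fraction=1.0)
```

In this toy case the read is kept with a minimum of 0, dropped with a minimum of 1 (three of ten bases get masked, which exceeds the 0.2 limit), and kept again once the no-call limit is relaxed to 1. If the real data behave the same way when --max-no-call-fraction is set to 1, that would support the explanation.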
Sorry for the late reply. I'm unable to share any reads due to privacy protections. I ran FilterConsensusReads with a variety of settings and here are some more results:
If I can get some additional samples that aren't protected I will share some reads with you to help troubleshoot.
Thanks
@dstephensSD you could probably just replace all the bases with As, change the read names to READ<i>, and then there's no information left that's a privacy concern. You could also set all mappings to chr1 position 1.
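A rough sketch of that scrubbing in plain Python, working on SAM text records (for example, the non-header lines from samtools view). The function name and the keep_no_calls option are made up for this example. Keeping existing Ns is an addition to the suggestion above, on the assumption that they matter for the no-call filter. Mates keep a shared name so pairs stay together. Base qualities and tags are left untouched, so check that none of them (MD, for instance) carry anything sensitive.

```python
# Sketch only: scrub identifying content from SAM record lines (no header lines).
# anonymize_records and keep_no_calls are hypothetical names for illustration.

def anonymize_records(records, keep_no_calls=True):
    """Rename reads to READ<i>, replace bases with A, and map everything to chr1:1."""
    names = {}  # original read name -> READ<i>, so mates share one new name
    scrubbed = []
    for line in records:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            raise ValueError("expected at least 11 tab-separated SAM fields")

        # QNAME: the same original name always maps to the same READ<i>.
        fields[0] = names.setdefault(fields[0], f"READ{len(names) + 1}")

        # RNAME / POS: only rewrite reads that are actually mapped.
        mapped = fields[2] != "*"
        if mapped:
            fields[2], fields[3] = "chr1", "1"

        # RNEXT / PNEXT: same treatment for the mate's position.
        if fields[6] != "*":
            fields[6] = "=" if mapped else "chr1"
            fields[7] = "1"

        # SEQ: every base becomes A; optionally keep no-calls as N.
        if fields[9] != "*":
            fields[9] = "".join(
                "N" if keep_no_calls and base in "Nn" else "A"
                for base in fields[9]
            )

        scrubbed.append("\t".join(fields))
    return scrubbed


# Two made-up mates from one pair.
r1 = "\t".join(["mol_17", "99", "chr7", "5500", "60", "4M",
                "=", "5600", "104", "ACGN", "IIII", "cD:i:3"])
r2 = "\t".join(["mol_17", "147", "chr7", "5600", "60", "4M",
                "=", "5500", "-104", "TTGA", "IIII", "cD:i:2"])
scrubbed = anonymize_records([r1, r2])
```

The header would still need a chr1 entry for the result to be a valid file, and any sample names in the header should be removed separately.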