Question about SvABA parameter, "--chunk-size"

Open wangxlab opened this issue 4 years ago • 1 comments

Hi,

I read your explanation of "--chunk-size" in the GitHub issue "Parameters for max reads and chunk size #68", where you explained: "Max chunk also just changes the size of the anchor window." (The old name of the parameter is "max_chunk", right?) The parameter definition says: "Size of a local assembly window (in bp). Set 0 for whole-BAM in one assembly." I would like to know how --chunk-size affects the detection of structural variants such as exon duplications. Could you advise whether I should set --chunk-size to 0 or to a large number such as 30000?

I also read the parameter definitions for '--read-tracking' and '--error-rate':

--read-tracking      Track supporting reads by qname. Increases file sizes. [off]
-e, --error-rate     Fractional difference two reads can have to overlap. See SGA. 0 is fast, but requires error correcting. [0]

However, I find these hard to understand and cannot decide which values to use for detecting exon duplications. I tested --error-rate 0.5 and 0.7, but SvABA stopped partway through the run. Could you give some advice?

Thank you, Sanghoon

wangxlab · May 05 '20 23:05

I got answers from the developer, Jeremiah Wala, via email. I am posting them below for the benefit of others. I appreciate Jeremiah's kind explanation.
##################################################

  • chunk-size: I wouldn't touch this parameter. Unless your BAM file is around 100 Mb, you don't want to set it to 0. It's a memory management issue and shouldn't affect sensitivity.
  • read-tracking: this just makes the output files much bigger by recording the names of the reads associated with each variant. It doesn't affect performance.
  • error-rate: this is another parameter that you probably should not change. The default is to do error correction during the assembly, which is the right approach.

Overall, you should be able to use svaba to detect exon duplication and deletion in whole-genome sequencing. It won't work in whole-exome though, since you won't be able to see the breakpoints.
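For anyone who wants to turn the advice above into a command line, here is a minimal sketch (my own, not from Jeremiah). It only builds the argument list and does not run SvABA. It assumes the `svaba run` options `-t` (BAM), `-G` (reference), `-a` (run ID) and `-p` (threads) as I understand them from the SvABA README, so please check them against `svaba run --help` for your version. The helper name, the file names and the 100 MB cutoff constant are hypothetical and only illustrate the "leave the defaults alone" rule.

```python
import os
import shlex

# Rough "tiny BAM" size taken from the advice above (assumed to mean ~100 megabytes).
SMALL_BAM_BYTES = 100 * 1024 * 1024


def build_svaba_command(bam_path, reference, run_id, threads=4, bam_size_bytes=None):
    """Build (but do not run) a svaba command that keeps the default settings.

    --error-rate and --read-tracking are deliberately never added.
    --chunk-size 0 is added only when the BAM is about 100 MB or smaller.
    """
    if bam_size_bytes is None:
        # Fall back to the real file size when the caller does not supply one.
        bam_size_bytes = os.path.getsize(bam_path)

    cmd = ["svaba", "run",
           "-t", bam_path,      # input BAM (assumed flag, see lead-in)
           "-G", reference,     # reference FASTA (assumed flag)
           "-a", run_id,        # run ID used for output names (assumed flag)
           "-p", str(threads)]  # number of threads (assumed flag)

    if bam_size_bytes <= SMALL_BAM_BYTES:
        # Whole-BAM assembly is only suggested as tolerable for very small inputs.
        cmd += ["--chunk-size", "0"]
    return cmd


if __name__ == "__main__":
    # Hypothetical 50 GB whole-genome BAM: all assembly parameters stay at defaults.
    wgs_cmd = build_svaba_command("sample.bam", "ref.fa", "exon_dup",
                                  threads=8, bam_size_bytes=50 * 1024 ** 3)
    print(shlex.join(wgs_cmd))
```

The point of the design is that a typical whole-genome BAM never gets any of the three parameters discussed here, and `--chunk-size 0` appears only for a very small input.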

wangxlab · May 11 '20 18:05