TOBIAS icon indicating copy to clipboard operation
TOBIAS copied to clipboard

Read Shifting

Open soheilj opened this issue 2 years ago • 4 comments

Hi,

Thank you for your great tool. My reads (in the bam file) are already shifted. Is there a way to prevent ATACorrect from shifting them again? Can I just set the --read_shift parameter to "--read_shift 0 0"? Thank you.

soheilj avatar Jun 16 '22 04:06 soheilj

Hi @soheilj ,

Yes exactly, --read-shift 0 0 will prevent the shifting. After the run, you can also have a look in the first two plots of the output <prefix>_atacorrect.pdf file to ensure that the bias is similar for forward / reverse, as it should be mirrored around the insert site. That is a good test to see if the initial shift was right.

BR Mette

msbentsen avatar Jun 17 '22 09:06 msbentsen

Thank you very much, Mette!

soheilj avatar Jun 17 '22 20:06 soheilj

Hi Mette, I also had another question- I noticed you are using the following parameters for macs2 peak calling in the snakemake pipeline. --nomodel --shift -100 --extsize 200 --broad Is there a particular reason for using these parameters? I'm asking since macs2 returns narrow peaks by default and also for paired-end data it is recommended to use -f BAMPE instead of shifting and extending the reads. I'm using the following macs2 parameters and I was wondering if they would be compatible with TOBIAS design? --keep-dup all -q 0.01 -f BAMPE Thank you, Soheil.

soheilj avatar Jun 18 '22 05:06 soheilj

Hi,

Good question. For broad vs narrow peaks, there is no greater meaning, and I think narrow peaks work just as well. I believe I just chose --broad in order to potentially merge nearby peaks into greater regions (see also discussion here: https://github.com/loosolab/TOBIAS/issues/61).

For paired-end vs. shifting, I found this answer by the macs2 developer:

If you followed original protocol for ATAC-Seq, you should get Paired-End reads. If so, I would suggest you just use "--format BAMPE" to let MACS2 pileup the whole fragments in general. But if you want to focus on looking for where the 'cutting sites' are, then '--nomodel --shift -100 --extsize 200' should work. (https://github.com/macs3-project/MACS/issues/145#issuecomment-248369119)

So I chose the shifting to focus on getting the pileup of Tn5 inserts, rather than the full length of the fragments (which might span one or more nucleosomes). This is slightly different from e.g. ChIP-seq, where you would want to pileup the whole fragment.

.... but all that being said, I don't think it makes a huge difference. The main goal is to subset the full genome to the regions which are open, and are thus available for footprinting. So I think you can safely use both the BAMPE and shifting options, if the peaks look like they are capturing the open chromatin well.

BR Mette

msbentsen avatar Jun 20 '22 09:06 msbentsen