
processing HiC-Pro using split_reads.py

Open yamzaleg opened this issue 4 years ago • 2 comments

Hello,

I was able to run HiC-Pro successfully on my files, and to save time I split the files into chunks of 10 M reads before processing them. The final bowtie and hic results yield about 33 files per sample. For the .FiltPairs, SCPairs, DEPairs, and .validPairs files in the data directory, how should I combine them into one file (is it simply a matter of cat'ing the files)? Are the DEPairs files the ones I should use for further processing?

Yonatan

yamzaleg, Oct 27 '21 20:10

Hi Yonatan, Only the validPairs files are merged to generate the allValidPairs file and to construct the contact maps. This is a simple cat ... with an additional step to remove the duplicated reads. Best

nservant, Oct 27 '21 20:10
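
For readers who want to see what "cat plus duplicate removal" could look like, here is a minimal sketch in Python. It is not HiC-Pro's own merge code. It assumes the validPairs files are tab-separated with the chromosome and position of the two mates in columns 2, 3, 5 and 6, and it treats two pairs as duplicates when those four fields match. The function name `merge_valid_pairs` and the file paths are hypothetical.

```python
def merge_valid_pairs(paths, out_path):
    """Concatenate validPairs chunks and keep one pair per coordinate set.

    Assumption (not taken from the thread): columns 2, 3, 5 and 6 hold
    chr1, pos1, chr2 and pos2, and pairs sharing all four are duplicates.
    """
    seen = set()   # coordinate keys already written
    kept = 0       # number of pairs written to out_path
    with open(out_path, "w") as out:
        for path in sorted(paths):
            with open(path) as fh:
                for line in fh:
                    fields = line.rstrip("\n").split("\t")
                    if len(fields) < 6:
                        continue  # skip blank or malformed lines
                    key = (fields[1], fields[2], fields[4], fields[5])
                    if key in seen:
                        continue  # duplicate pair, drop it
                    seen.add(key)
                    out.write("\t".join(fields) + "\n")
                    kept += 1
    return kept

# Hypothetical usage, with made-up paths:
#   import glob
#   n = merge_valid_pairs(glob.glob("sample1/*.validPairs"),
#                         "sample1.allValidPairs")
```

This version holds every coordinate key in memory, which is fine for a quick check but may be too heavy for deeply sequenced samples; sorting the concatenated file on the same four columns and dropping adjacent repeats is the usual lower-memory alternative.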

Hi!

Thank you so much. I also wanted to try processing the data with FitHiChIP to look at differential looping and significant cis-interacting peaks. They suggest using a peak file along with the validPairs file, which they said you can derive by calling MACS2 on the bam files. I have two questions: 1) which bam files should I use, given that there are three directories in the bowtie_alignment directory? 2) since I split the files beforehand, do I just samtools merge them (this one seems obvious, but I want to make sure)?

Yonatan

yamzaleg, Nov 04 '21 06:11