examples of the input files to the train process without AQUAS

Open avilella opened this issue 7 years ago • 1 comment

Hi,

Are there examples of what the input files to the training process should look like if they haven't been produced by the AQUAS pipeline? E.g., starting from a file like file1_dedup.bam, which peak and signal-track files are needed, and how are they produced with MACS2?

Thanks.

avilella, Jul 14 '17 12:07

Hi! Sorry for the late response; just got back from traveling.

We don't have a good example of input files that haven't gone through AQUAS, sorry, but you could download and extract our processed data to get a sense of the expected format. You can also look at the run_pipeline_commands() function in prepData.py (line 956, https://github.com/kundajelab/coda/blob/master/prepData.py#L956). That function is essentially a wrapper that calls the AQUAS pipeline via the included shell scripts. Once you see which parameters we pass to AQUAS, you can map them to actual MACS2 calls using the AQUAS documentation (https://docs.google.com/document/d/1lG_Rd7fnYgRpSIqrIfuVlAz2dW1VaSQThzk836Db99c/edit#heading=h.9ecc41kilcvq), which lists the MACS2 calls that AQUAS makes.
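
As a rough illustration of what such a mapping to MACS2 might look like, here is a minimal Python sketch that only builds the command lines (it does not run them). The `callpeak` and `bdgcmp` subcommands and the flags shown are ones MACS2 provides, but the helper name `macs2_commands` and the specific values (p-value threshold, `--extsize`, genome size) are assumptions for illustration, not taken from prepData.py or the AQUAS documentation, so please check them against those sources before relying on them.

```python
import shlex


def macs2_commands(bam, name, genome="hs", extsize=200, control=None):
    """Build (but do not run) MACS2 commands for peaks and signal tracks.

    All parameter values here are illustrative assumptions; the values
    Coda actually expects should be read off prepData.py and the AQUAS docs.
    """
    # Peak calling; -B writes bedGraph pileups, --SPMR scales them per million reads.
    callpeak = [
        "macs2", "callpeak",
        "-t", bam, "-f", "BAM",
        "-n", name, "-g", genome,
        "-p", "0.01",
        "--nomodel", "--shift", "0", "--extsize", str(extsize),
        "--keep-dup", "all",
        "-B", "--SPMR",
    ]
    if control is not None:
        callpeak += ["-c", control]

    # With -B, callpeak writes these two bedGraph files next to the peak file.
    treat = f"{name}_treat_pileup.bdg"
    lam = f"{name}_control_lambda.bdg"

    # Signal tracks: fold enrichment and Poisson p-value over the local lambda.
    fold_change = ["macs2", "bdgcmp", "-t", treat, "-c", lam,
                   "-o", f"{name}_FE.bdg", "-m", "FE"]
    pvalue = ["macs2", "bdgcmp", "-t", treat, "-c", lam,
              "-o", f"{name}_ppois.bdg", "-m", "ppois"]
    return [callpeak, fold_change, pvalue]


if __name__ == "__main__":
    # Print the commands so they can be inspected or pasted into a shell.
    for cmd in macs2_commands("file1_dedup.bam", "file1"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The first command would produce a `file1_peaks.narrowPeak` file, and the two `bdgcmp` commands would produce bedGraph signal tracks. Whether Coda needs these exact tracks, and in which final format, is something to confirm from the processed data mentioned above.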

Hope that helps!

kohpangwei, Jul 28 '17 21:07