
NGSTK peak caller function signature homogenization

Open vreuter opened this issue 8 years ago • 3 comments

Homogenization seems to be such a :fire: theme right now. Anyway... It'd be nice to bring the signatures for the peak caller functions into closer alignment. There's a slight discrepancy both in names and order. Paths to treatment/control come first and are adjacent in sppCallPeaks while they're not adjacent in macs2CallPeaks. Additionally, we have plural versions of the parameter names in the MACS2 wrapper.

The first one's clearly super easy and I could just switch it, but I wanted to ask before doing so to get an estimate of how widely these functions are used. Is there significant usage beyond open_pipelines? For the second issue, it looks like the R script that the SPP function wraps doesn't support multiple input files for treatment and for control, hence the singular parameter names. The easy way out would be to singularize the names in the MACS2 wrapper, but the more interesting/perhaps better way would be to support multiple treatment and control paths in the SPP script.

  • [ ] Parameter order
  • [ ] Parameter names

vreuter avatar Aug 24 '17 19:08 vreuter

You're absolutely right. I can also try to handle this when I have a go at sorting out the naming scheme of the remaining NGSTk functions.

Regarding SPP, I have the feeling that there isn't anyone out there really using it for peak calling on a routine basis. While I experimented a bit with it, I have never actually used it productively and don't plan to, so if no one is using SPP, I wouldn't bother adapting its script to support multiple inputs.

afrendeiro avatar Aug 26 '17 14:08 afrendeiro

OK thanks a lot for this info. I will not explore multiple inputs for SPP then. I'm also happy to make the adjustments to names and order for the peak calling wrappers so let me know if you want me to do that.

vreuter avatar Aug 26 '17 15:08 vreuter

Just realized the reason for the different parameter order between the MACS2 and SPP peak calling functions: MACS2 is able to call peaks without a control, so the control_bams argument is a keyword argument with default None, while SPP always requires a control. One could remove the default from control_bams in the MACS2 wrapper and have the docstring specify that when passed None it will not use a control, but having the default argument in the function signature is in fact more intuitive/explicit. No strong preference though, up to you to decide @vreuter I guess :)
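
To make the trade-off concrete, here is a minimal sketch of what the two signatures could look like if the names were singularized and the optional control stayed a keyword argument in the MACS2 wrapper. The function names, the parameter lists, and the dict returned by the SPP stand-in are hypothetical and only for illustration; they are not the actual NGSTk signatures. The `-t`, `-c`, `--outdir` and `-n` options are the standard `macs2 callpeak` ones.

```python
def macs2_call_peaks_sketch(treatment_bam, output_dir, sample_name, control_bam=None):
    """Hypothetical stand-in for the MACS2 wrapper.

    The control is optional, so it keeps a default of None and therefore
    has to come after the required parameters.
    """
    cmd = ["macs2", "callpeak", "-t", treatment_bam]
    if control_bam is not None:
        # Only pass a control to MACS2 when one was given.
        cmd += ["-c", control_bam]
    cmd += ["--outdir", output_dir, "-n", sample_name]
    return " ".join(cmd)


def spp_call_peaks_sketch(treatment_bam, control_bam, output_dir, sample_name):
    """Hypothetical stand-in for the SPP wrapper.

    SPP always needs a control, so the control is a required positional
    parameter placed right next to the treatment.
    """
    if control_bam is None:
        raise ValueError("SPP requires a control BAM")
    # The real wrapper would build the R script command here; this sketch
    # just returns the resolved arguments.
    return {
        "treatment_bam": treatment_bam,
        "control_bam": control_bam,
        "output_dir": output_dir,
        "sample_name": sample_name,
    }
```

With this layout the parameter names match across both wrappers, and the only remaining difference in order is the one forced by the default: `macs2_call_peaks_sketch("chip.bam", "peaks", "sample1")` works without a control, while callers that have one pass `control_bam="input.bam"` explicitly.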

afrendeiro avatar Aug 28 '17 18:08 afrendeiro