infercnvpy

Make nextflow pipeline to run copyKAT and SCEVAN

Open grst opened this issue 2 years ago • 2 comments

I feel rpy2 is unreliable, and the R packages are cumbersome to install and pull in a lot of additional dependencies. On top of that, the R methods are relatively slow compared to the infercnv algorithm in Python, so it makes sense to run them in parallel on individual patients, potentially on HPC.

I would suggest that we

  • remove the current copykat function from the package
  • build a small nextflow pipeline that
    • takes an anndata object as input
    • splits it up by patient (or whatever variable in obs)
    • converts it to SingleCellExperiment
    • runs copykat and/or SCEVAN
    • has the dependencies packaged as docker/singularity containers.
  • provide loader functions for the results of copykat and SCEVAN so that the visualization functions of infercnvpy can be used.
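
As a rough illustration of the "split by patient" step, here is a minimal Python sketch that a pipeline process could call. The function names (`group_cells_by`, `split_h5ad`), the default `patient` column, and the one-file-per-label output layout are all assumptions made for this example and are not part of infercnvpy. Labels are used directly as file names, so they are assumed to be filesystem-safe.

```python
from collections import defaultdict
from pathlib import Path


def group_cells_by(labels):
    """Map each distinct label (e.g. a patient ID) to the row indices that carry it."""
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[str(label)].append(i)
    return dict(groups)


def split_h5ad(in_path, out_dir, key="patient"):
    """Write one .h5ad file per value of adata.obs[key] and return the paths.

    Hypothetical helper: `key` and the file naming scheme are assumptions.
    """
    import anndata as ad  # imported lazily so the grouping helper works without it

    adata = ad.read_h5ad(in_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for label, idx in group_cells_by(adata.obs[key]).items():
        out = out_dir / f"{label}.h5ad"
        # Subset the cells of this group and materialize the view before writing.
        adata[idx].copy().write_h5ad(out)
        paths.append(out)
    return paths
```

Each resulting file could then be handed to a separate copykat or SCEVAN task, which is what makes the per-patient parallelization on HPC straightforward.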

grst, Dec 20 '21 08:12