Installation failure caused by conda conflicts

Open ysbioinfo opened this issue 3 years ago • 10 comments

Hi, first of all, thanks for developing such an awesome tool. I want to apply it to my WGS data to call somatic CNAs. However, when I tried this step: conda env create -n ACEseqWorkflow -f $PATH_TO_PLUGIN_DIRECTORY/resources/analysisTools/copyNumberEstimationWorkflow/environments/conda.yml it seems that many conda packages, and especially their versions, conflict with each other. Some package versions also cannot be found in the current conda channels. Do you have any recommendations on how to solve this problem?

Many thanks!

Yang

ysbioinfo avatar May 14 '21 08:05 ysbioinfo

Hi Yang.

One problem with Conda is that it can be hard to reconstitute old environments. In principle, the cause could be conflicts between packages of the newer Conda installation and the requirements of the packages in the old environment. It is also possible that the old packages have been removed from the channels. At least for Bioconda packages, you could add the bioconda-legacy channel. That might work.
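As a rough sketch of how the legacy channel could be added, the helper below writes a copy of an environment file with one extra channel appended to its channels list. It assumes the file has a top-level `channels:` list; the function name `add_fallback_channel` and the output file name are made up for illustration, and whether the resulting environment actually resolves has not been tested.

```shell
# Sketch only: append a channel to the end of the "channels:" list of a
# conda environment file (lowest priority) and write the result to a new file.
# Assumes a top-level "channels:" key followed by "- name" list items.
add_fallback_channel() {
  # $1 = input YAML, $2 = channel to append, $3 = output YAML
  awk -v ch="$2" '
    /^channels:[[:space:]]*$/ { in_ch = 1; print; next }
    in_ch && /^[[:space:]]*-[[:space:]]/ {
      indent = $0; sub(/-.*/, "", indent)   # remember the list indentation
      print; next
    }
    in_ch { print indent "- " ch; in_ch = 0 } # first line after the list
    { print }
    END { if (in_ch) print indent "- " ch }   # list was the last block
  ' "$1" > "$3"
}

# Hypothetical usage:
#   add_fallback_channel conda.yml bioconda-legacy conda.legacy.yml
#   conda env create -n ACEseqWorkflow -f conda.legacy.yml
```

Appending the channel at the end keeps the original channels at higher priority, so the legacy channel is only consulted for packages the others no longer provide.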

I am currently also looking into collecting some information on the available Docker containers. The docs seem outdated here (sorry).

vinjana avatar May 19 '21 10:05 vinjana

Hi @YangShi-PKU

Here is another option that may spare you a lot of hassle with the installation and setup. @suhrig has prepared a number of Docker containers of the workflow plus an archive with the required reference data. You can retrieve the data from https://ppcg.dkfz.de/pipelines/. The archives contain the following Roddy-based workflows:

  • ACEseq 1.2.10
  • SNV 1.2.166-1
  • Indel 1.2.177-2

For ACEseq, this is not the newest version of the workflow, but it is pretty close to what we currently use in our production systems (1.2.8). The archives were prepared for a data analysis in the context of a manuscript that is currently in preparation.

I hope this helps!

vinjana avatar May 21 '21 09:05 vinjana

Thanks, I'll try that!

Best

Yang

ysbioinfo avatar Jun 01 '21 07:06 ysbioinfo

Hi,

Thanks for sharing the Docker containers. I am also trying to run your tool and am getting conda conflicts.

I am wondering whether the pipeline in the Docker container can run with hg38?

Thank you very much.

Best wishes, Yuyao

YY-SONG0718 avatar Jul 22 '21 17:07 YY-SONG0718

Hi Yuyao,

The prebuilt containers are not compatible with hg38, unfortunately.

Regards, Sebastian

suhrig avatar Jul 23 '21 06:07 suhrig

Hi Sebastian,

Thanks for the reply. Would it be possible to update the conda environment configuration file of the ACEseq workflow (hg38)? If the .yml of a working installation is freshly exported, I assume that the packages will be compatible.
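A fresh export of this kind could look roughly like the sketch below. The environment name and output file name are placeholders, and `export_env` is just an illustrative wrapper; dropping build strings with `--no-builds` makes the file less tied to one machine, but it cannot bring back packages that are no longer in the channels.

```shell
# Sketch only: export an existing, working conda environment to a YAML file.
export_env() {
  # $1 = environment name (placeholder), $2 = output file (placeholder)
  conda env export -n "$1" --no-builds > "$2"
}

# Hypothetical usage:
#   export_env ACEseq conda.fresh.yml
```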

Many thanks.

Best wishes, Yuyao

YY-SONG0718 avatar Jul 23 '21 07:07 YY-SONG0718

There is an hg38 branch in this repository. I think @NagaComBio is more qualified than me to answer what the status of this branch is and whether it is usable.

suhrig avatar Jul 23 '21 08:07 suhrig

Sorry for the delayed response, @YY-SONG0718. @vinjana is trying to solve the hg19 conda issue; we will get back to the hg38 conda env issue ASAP. We usually use our local cluster environment for development and production, so we have paid little attention to the conda env files so far. We will try to solve this soon.

NagaComBio avatar Jul 23 '21 08:07 NagaComBio

Hi @NagaComBio @suhrig. No problem, Thanks for being supportive :)

YY-SONG0718 avatar Jul 23 '21 10:07 YY-SONG0718

O.k., after some testing we managed to get the Conda environment to work again with the newest hg38-enabled development version of the workflow (credits go to @NagaComBio!). Still, I don't think this will be a good solution, because of some general problems with the Conda-based approach.

Some old packages are no longer available in the Conda channels used. This means that it is not possible to reinstall a working environment with just conda env create -n ACEseq -f conda.yaml.

The only option that I see would be distributing a conda-pack archive. Unfortunately, at least for us this was not sufficient, because

  1. some packages needed manual reinstallation (conda install --force-reinstall bioconductor-genomeinfodbdata bedtools==2.16.2), and
  2. (and worse) there is a bug in R that may cause problems if you run the workflow in parallel.

The only possible solution that I see at the moment to cope with these issues would be to containerize. I could also provide you with the conda-pack archive, but it will probably require some more time investment from your side.
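For orientation, handling such a conda-pack archive would go roughly as sketched below. This assumes the conda-pack tool is installed on the packing side; the function names, environment name, archive name, and target directory are placeholders, and the manual reinstallation mentioned above may still be needed afterwards.

```shell
# Sketch only: pack a working environment into a relocatable archive.
pack_env() {
  # $1 = environment name, $2 = archive to create
  conda pack -n "$1" -o "$2"
}

# Sketch only: unpack the archive on the target machine and fix its paths.
unpack_env() {
  # $1 = archive, $2 = target directory
  mkdir -p "$2"
  tar -xzf "$1" -C "$2"
  . "$2/bin/activate"   # activate the unpacked environment
  conda-unpack          # rewrite hard-coded prefixes inside the environment
}

# Hypothetical usage:
#   pack_env ACEseq ACEseq.tar.gz          # on the machine with the working env
#   unpack_env ACEseq.tar.gz "$HOME/ACEseq" # on the target machine
#   conda install --force-reinstall bioconductor-genomeinfodbdata bedtools==2.16.2
```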

vinjana avatar Dec 01 '21 12:12 vinjana