Error in rule Checkm2Database

Open Rafa-Seong opened this issue 1 year ago • 6 comments

Name the workflow: HiFi-MAG-Pipeline

Describe the bug: Error in rule Checkm2Database

Expected behavior: Expected the pipeline to run as normal

Screenshots:

[Tue Oct 24 10:29:41 2023]
localrule Checkm2Database:
    input: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/inputs/revio_all.contigs.fasta
    output: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/uniref100.KO.1.dmnd, /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/CheckM2.complete.txt
    log: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/logs/Checkm2Database.log
    jobid: 15
    benchmark: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/benchmarks/Checkm2Database.tsv
    reason: Missing output files: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/uniref100.KO.1.dmnd
    resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc
[Tue Oct 24 10:29:43 2023]
Error in rule Checkm2Database:
    jobid: 15
    output: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/uniref100.KO.1.dmnd, /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/CheckM2.complete.txt
    log: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/logs/Checkm2Database.log (check log file(s) for error message)
    conda-env: /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc
    shell:
        checkm2 database --download --path /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline &> /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/logs/Checkm2Database.log && touch /data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/CheckM2_database/CheckM2.complete.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-10-24T102939.524220.snakemake.log

Log files:

Traceback (most recent call last):
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/bin/checkm2", line 27, in <module>
    from checkm2 import predictQuality
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/checkm2/predictQuality.py", line 1, in <module>
    from checkm2 import modelProcessing
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/checkm2/modelProcessing.py", line 17, in <module>
    from tensorflow import keras
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 41, in <module>
    from tensorflow.python.eager import context
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/tensorflow/python/eager/context.py", line 28, in <module>
    from absl import logging
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/absl/logging/__init__.py", line 97, in <module>
    from absl import flags
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/absl/flags/__init__.py", line 35, in <module>
    from absl.flags import _argument_parser
  File "/data0/hifi/new_hifi/pb-metagenomics-tools-2.1.0/HiFi-MAG-Pipeline/.snakemake/conda/38b2454ccc60e533a4b4041ae242f4cc/lib/python3.6/site-packages/absl/flags/_argument_parser.py", line 82, in <module>
    class ArgumentParser(Generic[_T], metaclass=_ArgumentParserCache):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Rafa-Seong, Oct 24 '23 02:10

I am having the same issue as mentioned above.

Traceback (most recent call last):
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/bin/checkm2", line 27, in <module>
    from checkm2 import predictQuality
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/checkm2/predictQuality.py", line 1, in <module>
    from checkm2 import modelProcessing
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/checkm2/modelProcessing.py", line 17, in <module>
    from tensorflow import keras
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 41, in <module>
    from tensorflow.python.eager import context
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/tensorflow/python/eager/context.py", line 28, in <module>
    from absl import logging
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/absl/logging/__init__.py", line 97, in <module>
    from absl import flags
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/absl/flags/__init__.py", line 35, in <module>
    from absl.flags import _argument_parser
  File "/data1/snakemake/pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/5ecc75c830d4c67a2636691686d458e0_/lib/python3.6/site-packages/absl/flags/_argument_parser.py", line 82, in <module>
    class ArgumentParser(Generic[_T], metaclass=_ArgumentParserCache):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

nallsing-salk, Nov 21 '23 17:11

So, the environments will fail to install on our HPC with strict channel priorities (which is why I love Docker/Nextflow so much these days ...). I think incompatible packages are installed leading to this error.
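For readers unfamiliar with the message in the tracebacks above: it is Python's generic metaclass-conflict error, raised when a class's bases and its requested metaclass cannot be reconciled. A minimal sketch follows; `MetaA`, `MetaB`, `Base` and `try_conflicting_class` are hypothetical names for illustration and are not taken from absl or CheckM2. On Python 3.6, `typing.Generic` still had its own metaclass, which would be consistent with a newer absl release ending up in a python3.6 environment, but that is an inference and has not been confirmed in this thread.

```python
# Hypothetical metaclasses for illustration; not taken from absl or CheckM2.
class MetaA(type):
    pass


class MetaB(type):
    pass


class Base(metaclass=MetaA):
    pass


def try_conflicting_class():
    """Try to define a class whose base and requested metaclass disagree."""
    try:
        # Base was built with MetaA, but MetaB is requested explicitly.
        # Neither metaclass derives from the other, so class creation fails.
        class Derived(Base, metaclass=MetaB):
            pass
    except TypeError as exc:
        return str(exc)
    return None


message = try_conflicting_class()
print(message)
```

If the inference holds, the fix is at the environment level (a Python and absl combination that match), which is what the pinned recipe in this comment attempts.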

I was able to work around this by editing checkm2.yml, pinning packages that are compatible per the checkm2 yml on their GitHub, and specifying Python 3.8. I still had to disable strict channel priorities for the environment to resolve:

envs/checkm2.yml:

name: checkm2_env
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - checkm2 == 1.0.1
  - python == 3.8
  - scikit-learn=0.23.2
  - h5py=2.10.0
  - numpy=1.19.2
  - diamond=2.0.4
  - tensorflow >= 2.2.0, <2.6.0
  - lightgbm=3.2.1
  - pandas=1.4.0
  - scipy=1.8.0
  - prodigal=2.6.3
  - setuptools
  - requests
  - packaging
  - tqdm

MicroSeq, Nov 29 '23 18:11

Hi @Rafa-Seong, @nallsing-salk, and @MicroSeq, thanks for your patience. This might be related to an issue with CheckM2 and specifically the Zenodo API. See thread here: https://github.com/chklovski/CheckM2/issues/83

It may or may not have been resolved.

I would recommend removing the existing conda environment and trying to re-install. Please let me know if this issue persists.
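A minimal sketch of the removal step, assuming the environment lives under `.snakemake/conda/<hash>` in the pipeline directory, as the `conda-env:` line in the original report suggests. The function name `remove_snakemake_env` and its `dry_run` flag are hypothetical helpers, not part of Snakemake. It also removes any sibling entry that shares the hash prefix (for example a copy of the env YAML), which is an assumption about the directory layout.

```python
import shutil
from pathlib import Path


def remove_snakemake_env(workdir, env_hash, dry_run=True):
    """Remove one Snakemake-created conda env under <workdir>/.snakemake/conda.

    Returns the paths that were removed (or would be removed when dry_run
    is True). Entries are matched by hash prefix, so a sibling file such as
    '<hash>.yaml' is included as well; this layout is an assumption.
    """
    conda_root = Path(workdir) / ".snakemake" / "conda"
    if not conda_root.is_dir():
        return []
    targets = sorted(conda_root.glob(env_hash + "*"))
    for target in targets:
        if dry_run:
            continue
        if target.is_dir():
            shutil.rmtree(target)
        else:
            target.unlink()
    return targets


if __name__ == "__main__":
    # Dry run only: list what would be deleted for the env hash in the report.
    for path in remove_snakemake_env(".", "38b2454ccc60e533a4b4041ae242f4cc"):
        print("would remove:", path)
```

After a real removal (`dry_run=False`), re-running the workflow with conda enabled should make Snakemake rebuild the environment from the recipe.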

I would prefer to keep the conda recipe as simple as possible, as pinning versions may work for some systems but not others.

dportik, Dec 06 '23 18:12

I had the same issue. @MicroSeq's solution worked for me, but I guess it's not preferable in the long run. Maybe making it possible to download the CheckM2 db manually and point to it in the config, as you have for GTDB, could be a feature worth adding in time?

CJREID, Feb 28 '24 00:02

> I had the same issue. @MicroSeq 's solution worked for me but I guess it's not preferable in the long run. Maybe making it possible to download the CheckM2 db manually and point to it in config as you have for GTDB could be a feature worth adding in time?

This would be the best solution as many HPC systems are configured without internet access on the nodes.

MicroSeq, Feb 28 '24 22:02

Update: I removed the Checkm2Database rule and modified Checkm2ContigAnalysis to take a pre-downloaded DIAMOND database, then ran with the original checkm2.yaml conda env (i.e. not the workaround described by @MicroSeq) and got the same error as @nallsing-salk and @Rafa-Seong. So I think this means it's an issue with something in the checkm2 conda env and not the Zenodo API?

Branch available here if anyone wants to confirm they get the same error.

CJREID, Mar 01 '24 03:03

I am unable to reproduce errors related to the checkm2 conda env, but keep me posted on any progress or continued challenges here.

Downloading the checkm2 database manually before beginning the workflow is now required, fixed in #79.
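Since the database now has to be in place before the workflow starts, a quick pre-flight check can catch a missing or truncated download early. This is only a sketch: `check_checkm2_db` is a hypothetical helper, the example path is made up, and the file name `uniref100.KO.1.dmnd` is taken from the log in the original report. How the pipeline's config actually refers to the database after #79 is not shown in this thread.

```python
from pathlib import Path


def check_checkm2_db(db_path):
    """Return a list of problems found with a pre-downloaded CheckM2 database.

    An empty list means the path looks plausible. This only inspects the
    file system; it does not validate the DIAMOND database contents.
    """
    path = Path(db_path)
    problems = []
    if path.suffix != ".dmnd":
        problems.append(f"{path} does not end in .dmnd (expected a DIAMOND database)")
    if not path.is_file():
        problems.append(f"{path} does not exist or is not a regular file")
    elif path.stat().st_size == 0:
        problems.append(f"{path} is empty; the download may have been interrupted")
    return problems


if __name__ == "__main__":
    # Hypothetical location; replace with wherever the database was downloaded.
    for problem in check_checkm2_db("CheckM2_database/uniref100.KO.1.dmnd"):
        print("problem:", problem)
```

Running a check like this on the login node before submitting is useful on HPC systems where compute nodes have no internet access, as noted earlier in the thread.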

dportik, Jun 27 '24 21:06