
KeyError: 'key' while downloading database

Open yazhinia opened this issue 1 year ago • 6 comments

Hello, I get the following error when I try to download the DIAMOND database using this command: checkm2 database --download --path /path_to_checkm2/checkm2/

Traceback (most recent call last):
  File "./checkm2", line 282, in <module>
    fileManager.DiamondDB().download_database(args.path)
  File "~/software/checkm2/bin/../checkm2/fileManager.py", line 127, in download_database
    backpack_downloader.download_and_extract(download_location, DOI, progress_bar=True, no_check_version=False)
  File "~/software/checkm2/bin/../checkm2/zenodo_backpack.py", line 52, in download_and_extract
    fname = str(file['self']).split('/')[-1]
KeyError: 'self'

Any suggestion is much appreciated.

Thank you.

Best, Yazhini

yazhinia avatar Oct 15 '23 17:10 yazhinia

Hello, I also get this error.

  • Project Version (or commit hash): CheckM2 version 1.0.2
  • Operating System: Linux
  • Python Version: Python 3.8

Description

An error occurred while attempting to download the database to a specified path using the command checkm2 database --download. The detailed stack trace indicates a KeyError, which seems to happen during the processing of downloaded files.

Steps to Reproduce

Here are the steps that triggered this error:

  1. Activate the relevant conda environment: conda activate checkm2
  2. Run the database download command: checkm2 database --download --path /data_sata1/ngs/databases/soft_database/checkm2/ --debug

Expected Behavior

I expected the command to successfully download the database files without throwing an error.

Actual Behavior

The program terminated, showing the following error message:

Traceback (most recent call last):
  File "/home/ngs/miniconda3/envs/checkm2/bin/checkm2", line 4, in <module>
    __import__('pkg_resources').run_script('CheckM2==1.0.2', 'checkm2')
  File "/home/ngs/miniconda3/envs/checkm2/lib/python3.8/site-packages/pkg_resources/__init__.py", line 722, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/ngs/miniconda3/envs/checkm2/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1561, in run_script
    exec(code, namespace, namespace)
  File "/home/ngs/miniconda3/envs/checkm2/lib/python3.8/site-packages/CheckM2-1.0.2-py3.8.egg/EGG-INFO/scripts/checkm2", line 282, in <module>
    fileManager.DiamondDB().download_database(args.path)
  File "/home/ngs/miniconda3/envs/checkm2/lib/python3.8/site-packages/CheckM2-1.0.2-py3.8.egg/checkm2/fileManager.py", line 127, in download_database
    backpack_downloader.download_and_extract(download_location, DOI, progress_bar=True, no_check_version=False)
  File "/home/ngs/miniconda3/envs/checkm2/lib/python3.8/site-packages/CheckM2-1.0.2-py3.8.egg/checkm2/zenodo_backpack.py", line 52, in download_and_extract
    fname = str(file['key']).split('/')[-1]
KeyError: 'key'

Solutions Attempted

  • Ensured proper directory permissions with read and write access.
  • Attempted to update all packages within the conda environment.
  • Checked internet connectivity to ensure access to external resources.

Additional Context

Any information regarding potential connectivity issues or dependencies during the download attempt could be helpful in resolving this issue.

Thank you. Jihong

LiuJih2021 avatar Oct 16 '23 08:10 LiuJih2021

Was just about to post this myself, I'm also getting the same error.

I am wondering if something changed about the Zenodo API very recently, as it seems like the issue is related to the resulting metadata json.

execution (within a snakemake workflow):

checkm2 database --download --path /dept/appslab/projects/2023/dp_snake-HiFi-MAG-Pipeline/CheckM2_database

error:

[10/16/2023 10:44:35 AM] INFO: Command: Download database. Checking internal path information.
Traceback (most recent call last):
  File "/dept/appslab/projects/old/2023/dp_snake-HiFi-MAG-Pipeline/.snakemake/conda/37d52c45/bin/checkm2", line 280, in <module>
    fileManager.DiamondDB().download_database(args.path)
  File "/dept/appslab/projects/old/2023/dp_snake-HiFi-MAG-Pipeline/.snakemake/conda/37d52c45/lib/python3.8/site-packages/checkm2/fileManager.py", line 127, in download_database
    backpack_downloader.download_and_extract(download_location, DOI, progress_bar=True, no_check_version=False)
  File "/dept/appslab/projects/old/2023/dp_snake-HiFi-MAG-Pipeline/.snakemake/conda/37d52c45/lib/python3.8/site-packages/checkm2/zenodo_backpack.py", line 52, in download_and_extract
    fname = str(file['key']).split('/')[-1]
KeyError: 'key'
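
The failing line indexes each file entry of the metadata JSON with one fixed field name, so any change in the response shape surfaces as a bare KeyError. A minimal sketch of a more tolerant lookup follows. It is not the CheckM2 or zenodo_backpack implementation: the function name `extract_filename` and the candidate field names (`key`, `filename`, `self`) are assumptions for illustration, and the fields Zenodo actually returns should be checked against a real response.

```python
from typing import Any, Mapping, Sequence

# Hypothetical candidate field names; the real Zenodo response may use others.
CANDIDATE_KEYS = ("key", "filename", "self")


def extract_filename(file_entry: Mapping[str, Any],
                     candidate_keys: Sequence[str] = CANDIDATE_KEYS) -> str:
    """Return the last path segment of the first candidate field that is present."""
    for key in candidate_keys:
        value = file_entry.get(key)
        if value:
            # Same normalisation as the failing line: keep the text after the last '/'.
            return str(value).split("/")[-1]
    # Report which fields were actually present so the mismatch is easy to diagnose.
    raise KeyError(
        f"none of {list(candidate_keys)} found; "
        f"available fields: {sorted(file_entry)}"
    )
```

If none of the candidates is present, the raised error lists the fields that were returned, which would have made it clearer whether the metadata format had changed.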

dportik avatar Oct 16 '23 18:10 dportik

Probably obvious to everyone, but as a workaround you can download the database from https://zenodo.org/records/5571251, unpack it, and set the location manually:

checkm2 database --setdblocation your_database_path/CheckM2_database/uniref100.KO.1.dmnd
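
For scripted setups, the unpack-and-register part of this workaround could be sketched as below. This is only an illustration under stated assumptions: the archive has already been downloaded by hand, it is in a format that Python's `shutil.unpack_archive` recognises from its extension, and it contains exactly one `.dmnd` file. The helper names `unpack_and_find_dmnd` and `setdblocation_command` are hypothetical; only the `checkm2 database --setdblocation` command itself comes from this thread.

```python
import shutil
from pathlib import Path
from typing import List


def unpack_and_find_dmnd(archive: Path, dest: Path) -> Path:
    """Unpack a manually downloaded archive and return the single .dmnd file inside."""
    dest.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension (e.g. .tar.gz, .zip).
    shutil.unpack_archive(str(archive), str(dest))
    matches = sorted(dest.rglob("*.dmnd"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected one .dmnd file under {dest}, found {len(matches)}"
        )
    return matches[0]


def setdblocation_command(dmnd_path: Path) -> List[str]:
    """Build the command from this thread that points CheckM2 at the database file."""
    return ["checkm2", "database", "--setdblocation", str(dmnd_path)]
```

The returned list can be passed to `subprocess.run(..., check=True)` in an environment where `checkm2` is on the PATH.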

sconlan avatar Oct 17 '23 20:10 sconlan

Thank you for raising the issue - will get to it shortly, as soon as I submit my PhD thesis (next week or two). In the meantime, you can download the database manually as suggested by @sconlan

chklovski avatar Oct 22 '23 23:10 chklovski

Possibly this is fixed now? The same error arising from https://github.com/wwood/singlem appears to have fixed itself - possibly this was due to transient issues with the zenodo API.

wwood avatar Nov 08 '23 04:11 wwood

@chklovski would you mind providing a link to the zenodo repo with the most current version of the diamond database in the readme? I'm at a research institution that employs man-in-the-middle attacks for external downloads, so it's much easier for me and folks at institutions with similar security protocols to manually download from a hosting repo with other tools. I see that it's above but this isn't really the optimal location for such info. Thanks!

itsmisterbrown avatar Dec 15 '23 21:12 itsmisterbrown