[BUG] Error in running pyscenic through docker image

Open thereallda opened this issue 2 years ago • 2 comments

Hi,

First, thank you for the great package and the nice tutorial.

Describe the bug

I followed the PBMC10k protocol. Everything went smoothly up to the pyscenic step. When I tried to run pyscenic grn using the Docker image, the error below occurred.

Steps to reproduce the behavior

  1. Command run when the error occurred:
docker run -it --rm \
-v $PWD:/data aertslab/pyscenic:0.10.0 pyscenic grn \
--num_workers 20 \
-o /data/adj.tsv \
-m grnboost2 \
/data/PBMC10k_filtered.loom /data/hs_hgnc_tfs.txt

The current working directory contained the .loom file and the TF list:

$ ls
arboreto_with_multiprocessing.py
dask-worker-space
filtered_feature_bc_matrix
hg38__refseq-r80__10kb_up_and_down_tss.mc9nr.feather
hs_hgnc_tfs.txt
motifs-v9-nr.hgnc-m0.001-o0.0.tbl
PBMC10k_filtered.loom
pbmc_10k_v3_filtered_feature_bc_matrix.tar.gz
SCENIC_protocol.ipynb
  2. Error encountered:
2022-06-07 09:53:16,066 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2022-06-07 09:53:17,727 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
/opt/venv/lib/python3.7/site-packages/dask/config.py:161: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  data = yaml.load(f.read()) or {}
preparing dask client
parsing input
/opt/venv/lib/python3.7/site-packages/arboreto/algo.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  expression_matrix = expression_data.as_matrix()
creating dask graph
20 partitions
computing dask graph
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
distributed.nanny - WARNING - Worker process 76 was killed by signal 15
distributed.scheduler - ERROR - Workers don't have promised key: ['tcp://127.0.0.1:40614'], finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e
NoneType: None
distributed.client - WARNING - Couldn't gather 1 keys, rescheduling {'finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e': ('tcp://127.0.0.1:40614',)}
distributed.nanny - WARNING - Restarting worker
distributed.utils_perf - WARNING - full garbage collections took 11% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 11% CPU time recently (threshold: 10%)
distributed.utils_perf - WARNING - full garbage collections took 13% CPU time recently (threshold: 10%)
...
distributed.utils_perf - WARNING - full garbage collections took 10% CPU time recently (threshold: 10%)
distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
distributed.scheduler - ERROR - Workers don't have promised key: ['tcp://127.0.0.1:42724'], finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e
NoneType: None
distributed.nanny - WARNING - Worker process 64 was killed by signal 15
distributed.client - WARNING - Couldn't gather 1 keys, rescheduling {'finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e': ('tcp://127.0.0.1:42724',)}
distributed.nanny - WARNING - Restarting worker
...
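
To get a quick overview of a log like the one above, the repeated warnings can be counted per type. This is only a minimal sketch: the function name `summarize_log` and the category labels are made up for illustration, and the patterns are copied from the messages shown in this log rather than from any documented log format.

```python
import re
from collections import Counter

# Patterns copied from the warning/error lines in the log above.
# The category names on the left are hypothetical labels, not dask terms.
PATTERNS = {
    "memory_restart": re.compile(r"Worker exceeded (\d+)% memory budget"),
    "killed": re.compile(r"Worker process (\d+) was killed by signal (\d+)"),
    "lost_key": re.compile(r"Workers don't have promised key"),
}


def summarize_log(text: str) -> Counter:
    """Count how often each known message type appears in a log dump."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines taken from the log in this issue.
sample = """\
distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
distributed.nanny - WARNING - Worker process 76 was killed by signal 15
distributed.scheduler - ERROR - Workers don't have promised key: ['tcp://127.0.0.1:40614'], finalize-b2b4ab88ca5e2b22e9fc9d537c38a67e
distributed.nanny - WARNING - Restarting worker
"""

print(summarize_log(sample))
```

A steadily growing `memory_restart` count would suggest that workers keep hitting their memory limit and being restarted, which matches the loop seen here.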

I checked that memory usage was around 24G while pyscenic was running, and my machine still had about 90G of free memory.
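
One possible reading of the "95% memory budget" warning is that the limit applies to each worker rather than to the whole machine, so plenty of free memory overall does not rule it out. The arithmetic below is a rough sketch under loud assumptions: it treats total memory as 24G + 90G = 114G (simply the sum of the two figures above) and assumes the memory is split evenly across the 20 workers. How dask actually sets the per-worker limit depends on its configuration and on what the container can see, so the helper `per_worker_budget_gib` is illustrative only.

```python
def per_worker_budget_gib(total_gib: float, n_shares: int, fraction: float = 0.95) -> float:
    """Memory a single worker may use before being restarted.

    Assumes (hypothetically) an even split of total memory across workers.
    The default fraction mirrors the "95% memory budget" in the warning.
    """
    if n_shares < 1:
        raise ValueError("n_shares must be at least 1")
    return total_gib / n_shares * fraction


# 114 is an assumed total (24G in use + 90G free, as reported above).
budget = per_worker_budget_gib(total_gib=114, n_shares=20)
print(f"Approximate per-worker budget: {budget:.2f} GiB")
```

Under these assumptions a single worker holding a large share of the expression matrix could exceed its own budget even though the machine as a whole has memory to spare.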

Please complete the following information:

  • pySCENIC version: 0.10.0
  • Installation method: Docker
  • Run environment: CLI
  • OS: CentOS 7
  • Package versions: NA

Any help would be appreciated!

Thanks!

thereallda avatar Jun 07 '22 12:06 thereallda

Have you solved this problem? I ran into the same issue when running pyscenic from the Singularity containers (aertslab-pyscenic-0.9.18.sif and aertslab-pyscenic-0.12.1.sif).

hyq9588 avatar Oct 07 '23 03:10 hyq9588

Yes. I solved it by using the PBMC 3k dataset and the following command.

docker run -it --rm -v $PWD:$PWD -w $PWD aertslab/pyscenic:0.10.0 pyscenic grn --num_workers 20 -o adj_pbmc.tsv -m grnboost2 pbmc3k.loom hs_hgnc_tfs.txt
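
For anyone who wants to vary the worker count or input files without retyping the command, it can be assembled programmatically. This is a minimal sketch: the helper `build_grn_command`, the directory `/data/project`, and the value `num_workers=4` are illustrative assumptions, and only flags that already appear in the commands in this thread are used. Whether fewer workers avoids the restarts on the larger dataset is not confirmed here.

```python
import shlex


def build_grn_command(workdir, loom, tfs, output, num_workers,
                      image="aertslab/pyscenic:0.10.0"):
    """Assemble the `docker run ... pyscenic grn` call as an argument list.

    Mirrors the command above: the working directory is mounted at the
    same path inside the container and used as the container's workdir.
    """
    return [
        "docker", "run", "-it", "--rm",
        "-v", f"{workdir}:{workdir}",
        "-w", workdir,
        image,
        "pyscenic", "grn",
        "--num_workers", str(num_workers),
        "-o", output,
        "-m", "grnboost2",
        loom, tfs,
    ]


# "/data/project" and num_workers=4 are hypothetical example values.
cmd = build_grn_command(
    workdir="/data/project",
    loom="pbmc3k.loom",
    tfs="hs_hgnc_tfs.txt",
    output="adj_pbmc.tsv",
    num_workers=4,
)

# Print a shell-quoted version for inspection; pass `cmd` to
# subprocess.run(cmd, check=True) to actually execute it.
print(shlex.join(cmd))
```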

thereallda avatar Oct 16 '23 12:10 thereallda