
arboreto_with_multiprocessing.py not able to use specified threads

Open MarcusLCC opened this issue 1 year ago • 6 comments

Hi, thanks for developing this amazing method

I'm using pySCENIC (0.12.1) on a Linux HPC system (80 cores, 500 GB RAM), in a conda environment where pySCENIC was installed via pip.

My expression input is a loom file, ~1.1 GB in size, with 27k genes and 60k cells.

I first tried the CLI with the following code:

loom_path="seurat_ds_5k.loom"
tf_file="allTFs_hg38.txt"
outdir="18_pyscenic/20230321_downsample5k"

pyscenic grn ${loom_path} ${tf_file} -o ${outdir}/adj.csv --num_workers 20 > ${outdir}/log 2>&1 &

This produced the following warnings (though it is still running):

2023-03-19 20:40:21,537 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2023-03-19 20:44:19,332 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
2023-03-19 22:49:16,009 - distributed.worker - WARNING - Could not find data: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ['tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679']} on workers: [] (who_has: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ['tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679']})
2023-03-19 22:49:16,013 - distributed.scheduler - WARNING - Worker tcp://127.0.0.1:42958 failed to acquire keys: {'ndarray-598b1d00ac144024c974f21a1bb7e818': ('tcp://127.0.0.1:44201', 'tcp://127.0.0.1:46424', 'tcp://127.0.0.1:41338', 'tcp://127.0.0.1:38435', 'tcp://127.0.0.1:41291', 'tcp://127.0.0.1:41161', 'tcp://127.0.0.1:37953', 'tcp://127.0.0.1:42931', 'tcp://127.0.0.1:40679')}

I then switched to arboreto_with_multiprocessing.py, using the following command:

arboreto_with_multiprocessing.py ${loom_path} ${tf_file} --method grnboost2 --output ${outdir}/adj.tsv --num_workers 15 --seed 777 > log 2>&1 &

When I watch the actual CPU and memory usage with a monitor such as htop, the programme runs on only about 3 cores most of the time, and changing --num_workers from 15 to other values such as 5 or 20 makes no difference (it still runs on ~3 cores). As progress seems slow, I'm wondering whether I'm doing some of the steps wrong.

May I have your advice on this? Advice on both the CLI warning messages and the arboreto_with_multiprocessing.py multicore issue would be much appreciated. Many thanks!

Best, Marcus

MarcusLCC avatar Mar 22 '23 00:03 MarcusLCC
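
When a job uses fewer cores than requested on a shared HPC node, one thing that may be worth ruling out is a limit on the CPUs the shell itself is allowed to use. This is only a diagnostic sketch and an assumption, not a confirmed cause of the behaviour described above. It compares the --num_workers value from the post with what nproc reports; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: compare the worker count requested in the post above with the
# number of CPUs this shell (and the jobs it starts) may actually use.

requested_workers=15    # value passed to --num_workers in the post above

# nproc (GNU coreutils) prints the CPUs available to the current process.
# It respects CPU affinity and the OMP_NUM_THREADS / OMP_THREAD_LIMIT variables.
allowed_cpus=$(nproc)

# nproc --all prints the number of CPUs installed on the node.
installed_cpus=$(nproc --all)

echo "installed=${installed_cpus} allowed=${allowed_cpus} requested=${requested_workers}"

if [ "${allowed_cpus}" -lt "${requested_workers}" ]; then
    echo "note: fewer CPUs are available than workers requested"
fi
```

If the allowed count is well below the installed count, the restriction comes from the environment (for example a scheduler allocation) rather than from pySCENIC.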

Same issue here. Do you have a solution?

qiruicheng avatar Apr 11 '23 04:04 qiruicheng

Same issue

carlos-a-enriquez avatar Apr 11 '23 17:04 carlos-a-enriquez

Same problem

wbrett87 avatar Apr 23 '23 15:04 wbrett87

Same problem. Solved by installing older versions of numpy & pyscenic.

Name                      Version      Build                 Channel
_libgcc_mutex             0.1          main                  http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
_openmp_mutex             5.1          1_gnu                 http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
aiohttp                   3.8.4        pypi_0                pypi
aiosignal                 1.3.1        pypi_0                pypi
arboreto                  0.1.6        pypi_0                pypi
async-timeout             4.0.2        pypi_0                pypi
asynctest                 0.13.0       pypi_0                pypi
attrs                     23.1.0       pypi_0                pypi
bokeh                     2.4.3        pypi_0                pypi
boltons                   23.0.0       pypi_0                pypi
ca-certificates           2023.01.10   h06a4308_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
certifi                   2022.12.7    py37h06a4308_0        http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
charset-normalizer        3.1.0        pypi_0                pypi
click                     8.1.3        pypi_0                pypi
cloudpickle               2.2.1        pypi_0                pypi
ctxcore                   0.2.0        pypi_0                pypi
cytoolz                   0.12.1       pypi_0                pypi
dask                      2022.2.0     pypi_0                pypi
dill                      0.3.6        pypi_0                pypi
distributed               2022.2.0     pypi_0                pypi
frozendict                2.3.8        pypi_0                pypi
frozenlist                1.3.3        pypi_0                pypi
fsspec                    2023.1.0     pypi_0                pypi
h5py                      3.8.0        pypi_0                pypi
heapdict                  1.0.1        pypi_0                pypi
idna                      3.4          pypi_0                pypi
importlib-metadata        6.6.0        pypi_0                pypi
interlap                  0.2.7        pypi_0                pypi
jinja2                    3.1.2        pypi_0                pypi
joblib                    1.2.0        pypi_0                pypi
ld_impl_linux-64          2.38         h1181459_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libblas                   3.9.0        16_linux64_openblas   http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libcblas                  3.9.0        16_linux64_openblas   http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libffi                    3.3          he6710b0_2            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgcc-ng                 11.2.0       h1234567_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgfortran-ng            11.2.0       h00389a5_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgfortran5              11.2.0       h1234567_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgomp                   11.2.0       h1234567_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
liblapack                 3.9.0        16_linux64_openblas   http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libopenblas               0.3.21       h043d6bf_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libstdcxx-ng              11.2.0       h1234567_1            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
llvmlite                  0.39.1       pypi_0                pypi
locket                    1.0.0        pypi_0                pypi
loompy                    3.0.7        pypi_0                pypi
markupsafe                2.1.2        pypi_0                pypi
msgpack                   1.0.5        pypi_0                pypi
multidict                 6.0.4        pypi_0                pypi
multiprocessing-on-dill   3.5.0a4      pypi_0                pypi
ncurses                   6.4          h6a678d5_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
networkx                  2.6.3        pypi_0                pypi
numba                     0.56.4       pypi_0                pypi
numexpr                   2.8.4        pypi_0                pypi
numpy                     1.19.5       py37h3e96413_3        http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
numpy-groupies            0.9.22       pypi_0                pypi
openssl                   1.1.1t       h7f8727e_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
packaging                 23.1         pypi_0                pypi
pandas                    1.3.5        pypi_0                pypi
partd                     1.4.0        pypi_0                pypi
pillow                    9.5.0        pypi_0                pypi
pip                       22.3.1       py37h06a4308_0        http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
psutil                    5.9.5        pypi_0                pypi
pyarrow                   12.0.0       pypi_0                pypi
pynndescent               0.5.10       pypi_0                pypi
pyscenic                  0.12.0       pypi_0                pypi
python                    3.7.9        h7579374_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
python-dateutil           2.8.2        pypi_0                pypi
python_abi                3.7          2_cp37m               http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
pytz                      2023.3       pypi_0                pypi
pyyaml                    6.0          pypi_0                pypi
readline                  8.2          h5eee18b_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
requests                  2.30.0       pypi_0                pypi
scikit-learn              1.0.2        pypi_0                pypi
scipy                     1.7.3        pypi_0                pypi
setuptools                65.6.3       py37h06a4308_0        http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
six                       1.16.0       pypi_0                pypi
sortedcontainers          2.4.0        pypi_0                pypi
sqlite                    3.41.2       h5eee18b_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tblib                     1.7.0        pypi_0                pypi
threadpoolctl             3.1.0        pypi_0                pypi
tk                        8.6.12       h1ccaba5_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
toolz                     0.12.0       pypi_0                pypi
tornado                   6.2          pypi_0                pypi
tqdm                      4.65.0       pypi_0                pypi
typing-extensions         4.5.0        pypi_0                pypi
umap-learn                0.5.3        pypi_0                pypi
urllib3                   2.0.2        pypi_0                pypi
wheel                     0.38.4       py37h06a4308_0        http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
xz                        5.4.2        h5eee18b_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
yarl                      1.9.2        pypi_0                pypi
zict                      2.2.0        pypi_0                pypi
zipp                      3.15.0       pypi_0                pypi
zlib                      1.2.13       h5eee18b_0            http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main

tyxdavid avatar May 11 '23 10:05 tyxdavid
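
To compare another environment against the one listed above, something like the following sketch could be used. It assumes that python and pip are on the PATH of the active environment. The helper name report_version is hypothetical, and choosing these five packages is also an assumption: the post names only numpy and pyscenic as the ones downgraded. The expected versions are copied from the listing above.

```shell
#!/usr/bin/env bash
# Sketch: compare installed package versions with those in the listing above.

# Print "ok" or "MISMATCH" for one package.
# Arguments: package name, expected version, installed version (may be empty).
report_version() {
    if [ "$2" = "$3" ]; then
        echo "ok $1 $3"
    else
        echo "MISMATCH $1 expected=$2 found=${3:-not installed}"
    fi
}

# name=version pairs copied from the listing above.
for spec in pyscenic=0.12.0 arboreto=0.1.6 numpy=1.19.5 dask=2022.2.0 distributed=2022.2.0; do
    name=${spec%%=*}
    expected=${spec#*=}
    # "pip show" prints a "Version:" line for an installed package;
    # the result stays empty when the package (or pip) is missing.
    actual=$(python -m pip show "${name}" 2>/dev/null | awk '/^Version:/ {print $2}')
    report_version "${name}" "${expected}" "${actual}"
done
```

A MISMATCH line only flags a difference from the listed environment; it does not by itself show that the package is the cause of the problem.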

Thanks so much!!!!!! This fixed it for me too... I just pulled the pyscenic 0.12.0 from docker hub and it works like a charm

wbrett87 avatar May 11 '23 15:05 wbrett87

Try the containerized versions of pySCENIC: https://pyscenic.readthedocs.io/en/latest/installation.html#docker-podman-and-singularity-apptainer-images

ghuls avatar Jun 13 '23 10:06 ghuls