pySCENIC
distributed.core - ERROR
/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/arboreto/algo.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
expression_matrix = expression_data.as_matrix()
creating dask graph
48 partitions
computing dask graph
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
msg = loads_msgpack(small_header, small_payload)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 197, in loads_msgpack
return msgpack.loads(payload, use_list=False, **msgpack_raw_false)
File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 102826 exceeds max_map_len(32768)
distributed.core - ERROR - 102826 exceeds max_map_len(32768)
Traceback (most recent call last):
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/core.py", line 386, in handle_stream
msgs = yield comm.read()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/tcp.py", line 206, in read
deserializers=deserializers)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
yielded = next(result)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/utils.py", line 79, in from_frames
res = _from_frames()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/utils.py", line 65, in _from_frames
deserializers=deserializers)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
msg = loads_msgpack(small_header, small_payload)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 197, in loads_msgpack
return msgpack.loads(payload, use_list=False, **msgpack_raw_false)
File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 102826 exceeds max_map_len(32768)
shutting down client and local cluster
distributed.core - ERROR - 102826 exceeds max_map_len(32768)
Traceback (most recent call last):
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/core.py", line 346, in handle_comm
result = yield result
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/scheduler.py", line 2018, in add_client
yield self.handle_stream(comm=comm, extra={'client': client})
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/core.py", line 386, in handle_stream
msgs = yield comm.read()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/tcp.py", line 206, in read
deserializers=deserializers)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
yielded = next(result)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/utils.py", line 79, in from_frames
res = _from_frames()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/comm/utils.py", line 65, in _from_frames
deserializers=deserializers)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
msg = loads_msgpack(small_header, small_payload)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/protocol/core.py", line 197, in loads_msgpack
return msgpack.loads(payload, use_list=False, **msgpack_raw_false)
File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 102826 exceeds max_map_len(32768)
tornado.application - ERROR - Exception in Future <Future cancelled> after timeout
Traceback (most recent call last):
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 970, in error_callback
future.result()
concurrent.futures._base.CancelledError
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33045 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33435 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33481 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33482 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33483 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33484 remote=tcp://127.0.0.1:58286>
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:33485 remote=tcp://127.0.0.1:58286>
finished
Traceback (most recent call last):
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/client.py", line 1487, in _gather
st = self.futures[key]
KeyError: 'finalize-7b1845663f7c382673df6fc49437374f'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "script.py", line 44, in <module>
adjacencies = grnboost2(ex_matrix, tf_names=tf_names, verbose=True)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/arboreto/algo.py", line 41, in grnboost2
early_stop_window_length=early_stop_window_length, limit=limit, seed=seed, verbose=verbose)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/arboreto/algo.py", line 135, in diy
.compute(graph, sync=True) \
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/client.py", line 2492, in compute
result = self.gather(futures)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/client.py", line 1652, in gather
asynchronous=asynchronous)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/client.py", line 670, in sync
return sync(self.loop, func, *args, **kwargs)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/utils.py", line 277, in sync
six.reraise(*error[0])
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/utils.py", line 262, in f
result[0] = yield future
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/distributed/client.py", line 1493, in _gather
None)
File "/home/user/anaconda3/envs/pyscenic_env/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
concurrent.futures._base.CancelledError: finalize-7b1845663f7c382673df6fc49437374f
How can I resolve this issue? Any suggestions would help. Thank you very much. EL
I have the same issue (distributed.protocol.core - CRITICAL - Failed to deserialize and concurrent.futures._base.CancelledError). It looks like the tornado package is involved, but I already have the latest version. Any help would be much appreciated, thank you!
Dear,
I did some research and the problem potentially resides in the dask.distributed package (see https://github.com/dask/distributed/issues/1830). The author of the dask framework, Matthew Rocklin, advises downgrading the tornado package to version 4.5.
Could you create a virgin (miniconda) environment in the following way and check if the problem persists?
For Linux OS:
conda create -n pyscenic python=3.6
. activate pyscenic
pip install tornado==4.5
pip install pyscenic
For Windows:
conda create -n pyscenic python=3.6
activate pyscenic
pip install tornado==4.5
pip install pyscenic
If this resolves the issue I'll put the fix in the source code and create a new release of pyscenic. Many thanks.
Kindest regards, Bram
Hi, thanks for the quick answer. I downgraded tornado to 4.5 but now I get another error with tornado:
tornado.application - ERROR - Exception in callback functools.partial(<function wrap.
Dear Hicham,
I tried to reproduce the problem using a fresh installation of pyscenic (version 0.8.16) in a virgin miniconda environment (on a Linux RedHat distribution running on a dual Intel Xeon E5-2680 v3 machine). I installed the latest version of tornado (5.1.1) and ran grnboost from the command line. I could not reproduce this error message.
This is my list of packages installed:
arboreto==0.1.5
attrs==18.2.0
boltons==18.0.1
certifi==2018.10.15
Click==7.0
cloudpickle==0.6.1
cycler==0.10.0
cytoolz==0.9.0.1
dask==0.20.2
decorator==4.3.0
dill==0.2.8.2
distributed==1.24.2
frozendict==1.2
h5py==2.8.0
HeapDict==1.0.0
interlap==0.2.6
kiwisolver==1.0.1
llvmlite==0.25.0
loompy==2.0.2
matplotlib==3.0.2
msgpack==0.5.6
multiprocessing-on-dill==3.5.0a4
networkx==2.2
numba==0.40.1
numpy==1.15.4
pandas==0.23.4
psutil==5.4.8
pyarrow==0.11.1
pyparsing==2.3.0
pyscenic==0.8.16
python-dateutil==2.7.5
pytz==2018.7
PyYAML==3.13
scikit-learn==0.20.1
scipy==1.1.0
six==1.11.0
sortedcontainers==2.1.0
tblib==1.3.2
toolz==0.9.0
tornado==5.1.1
tqdm==4.28.1
typing==3.6.6
umap-learn==0.3.6
zict==0.1.3
Anyhow, this problem is related to the GRNBoost2 step of pySCENIC which is provided through the arboreto package. This issue is already registered there: https://github.com/tmoerman/arboreto/issues/12 .
Kindest regards, Bram
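To make package lists like the one above easier to compare between environments, a small script can print only the packages that appear in the traceback. This is a minimal sketch, not part of pySCENIC or arboreto: the helper names installed_version and report are made up for illustration, and the choice of packages to check is an assumption based on the modules named in the log.

```python
def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for older interpreters such as Python 3.6.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def report(packages):
    """Build one pip-freeze-style line per package name."""
    lines = []
    for name in packages:
        found = installed_version(name)
        if found is None:
            lines.append("%s: not installed" % name)
        else:
            lines.append("%s==%s" % (name, found))
    return lines


# Packages that show up in the traceback (an assumption, adjust as needed).
PACKAGES = ("pyscenic", "arboreto", "dask", "distributed", "tornado", "msgpack")

if __name__ == "__main__":
    for line in report(PACKAGES):
        print(line)
```

Running it inside the activated environment and pasting the output into the issue keeps the comparison focused on the handful of packages involved.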
Dear Bram, Thank you for the kind advice. I still get a tornado error with this list of packages:
arboreto 0.1.5
import pandas as pd
from distributed import Client, LocalCluster
from arboreto.utils import load_tf_names
from arboreto.algo import grnboost2

if __name__ == '__main__':
    in_file = 'DGE.tsv'
    tf_file = 'TFs.txt'
    out_file = 'grnboost2_output.csv'

    # ex_matrix is a DataFrame with gene names as column names
    ex_matrix = pd.read_csv(in_file, sep='\t')

    # tf_names is read using a utility function included in Arboreto
    tf_names = load_tf_names(tf_file)

    # instantiate a custom Dask distributed Client
    client = Client(LocalCluster(n_workers=28, memory_limit=4e9))

    # compute the GRN
    network = grnboost2(expression_data=ex_matrix, tf_names=tf_names, client_or_address=client)

    # write the GRN to file
    network.to_csv(out_file, sep='\t', index=False, header=False)
I will open an issue on arboreto github, thanks again!
Hicham
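The script above relies on the expression matrix having gene names as column names. Before starting a cluster, that assumption can be checked cheaply by reading only the header row and counting how many transcription factor names it contains. This is a sketch of a pre-flight check, not a fix for the error in this thread; the helper names and the gene names in the example are hypothetical.

```python
import csv
import io


def read_header(handle, delimiter="\t"):
    """Return the column names from the first row of a delimited file."""
    return next(csv.reader(handle, delimiter=delimiter))


def tf_overlap(columns, tf_names):
    """Return the TF names found among the column names, in TF-list order."""
    column_set = set(columns)
    return [tf for tf in tf_names if tf in column_set]


if __name__ == "__main__":
    # Tiny in-memory stand-in for DGE.tsv (hypothetical gene names).
    example = io.StringIO("GeneA\tGeneB\tGeneC\n1\t0\t3\n")
    columns = read_header(example)
    found = tf_overlap(columns, ["GeneB", "GeneX"])
    print("%d columns, %d of 2 TFs present" % (len(columns), len(found)))
```

If none of the TF names appear in the header, the matrix may be oriented with genes as rows, or the gene identifiers may not match those in the TF list.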
Hi Bram, I followed the instructions at https://pyscenic.readthedocs.io/en/latest/#run-genie3-or-grnboost-from-arboreto-to-infer-co-expression-modules and ended up with the "concurrent.futures._base.CancelledError" mentioned above when running the grnboost2 function. I'm quite new to tornado. Is there an alternative function or a workaround that avoids this error? Thanks! Best regards, Yingyong
I also have this issue. Has anyone found a solution to it?
Update: in the end I managed to run it on a different cluster (no conda).
Hi, I ended up using a hybrid execution of pySCENIC and SCENIC R where GRNBoost2 is the only part run through python.
My python modules:
aniso8601==4.0.1 annoy==1.15.1 arboreto==0.1.5 arff==0.9 asn1crypto==0.24.0 attrs==19.1.0 backcall==0.1.0 backports-abc==0.5 backports.shutil-get-terminal-size==1.0.0 bcrypt==3.1.4 biopython==1.72 bitstring==3.1.5 bleach==2.1.3 blist==1.3.6 boltons==19.1.0 CellPhoneDB==1.1.0 certifi==2018.11.29 cffi==1.11.5 chardet==3.0.4 click==6.7 cloudpickle==0.8.1 cryptography==2.3 cycler==0.10.0 Cython==0.29 cytoolz==0.9.0.1 dask==1.0.0 deap==1.2.2 decorator==4.4.0 dill==0.2.9 distributed==1.26.1 ecdsa==0.13 entrypoints==0.2.3 Flask==1.0.2 Flask-RESTful==0.3.6 Flask-Testing==0.7.1 frozendict==1.2 funcsigs==1.0.2 gnureadline==6.3.8 h5py==2.9.0 HeapDict==1.0.0 html5lib==1.0.1 idna==2.7 interlap==0.2.6 ipykernel==4.8.2 ipython==6.4.0 ipython-genutils==0.2.0 ipywidgets==7.2.1 itsdangerous==1.1.0 jedi==0.12.1 Jinja2==2.10 jsonschema==2.6.0 jupyter-client==5.2.3 jupyter-core==4.4.0 kiwisolver==1.0.1 llvmlite==0.26.0 lockfile==0.12.2 loompy==2.0.2 louvain==0.6.1 MarkupSafe==1.1.0 matplotlib==3.0.2 mistune==0.8.3 mock==2.0.0 mpmath==1.0.0 msgpack==0.5.6 multiprocessing-on-dill==3.5.0a4 nbconvert==5.3.1 nbformat==4.4.0 netaddr==0.7.19 netifaces==0.10.7 networkx==2.2 nose==1.3.7 notebook==5.5.0 numba==0.41.0 numpy==1.16.0 pandas==0.23.4 pandocfilters==1.4.2 paramiko==2.4.1 parso==0.3.1 path.py==11.0.1 pathlib2==2.3.2 paycheck==1.0.2 pbr==4.1.0 pexpect==4.6.0 pickleshare==0.7.4 Pillow==5.3.0 prompt-toolkit==1.0.15 psutil==5.4.6 ptyprocess==0.6.0 pyarrow==0.11.1 pyasn1==0.4.3 pycparser==2.18 Pygments==2.2.0 PyNaCl==1.2.1 pyparsing==2.3.1 pyscenic==0.9.7 python-dateutil==2.7.5 python-igraph==0.7.1.post6 pytz==2018.7 PyWavelets==1.0.1 PyYAML==3.13 pyzmq==17.1.0 requests==2.19.1 scikit-image==0.14.2 scikit-learn==0.19.2 scipy==1.2.1 scrublet==0.2 seaborn==0.9.0 Send2Trash==1.5.0 simplegeneric==0.8.1 singledispatch==3.4.0.3 six==1.11.0 sortedcontainers==2.1.0 SQLAlchemy==1.2.14 sympy==1.2 tblib==1.3.2 terminado==0.8.1 testpath==0.3.1 toolz==0.9.0 tornado==5.1.1 tqdm==4.31.1 traitlets==4.3.2 
typing==3.6.6 umap==0.1.1 umap-learn==0.3.2 urllib3==1.23 virtualenv==16.0.0 wcwidth==0.1.7 webencodings==0.5.1 Werkzeug==0.14.1 widgetsnbextension==3.2.1 zict==0.1.4
The code for the GRNBoost2 Python run:
# Author: Hicham Affia - CHU Sainte Justine research center
import os
from distributed import LocalCluster, Client

if __name__ == '__main__':
    local_cluster = LocalCluster(n_workers=30, memory_limit=4e9)
    custom_client = Client(local_cluster)

    # Trick to get the IP address of the client (custom_client_name):
    # take the quoted scheduler URL from str(client) and drop the scheme
    custom_client_str = str(custom_client)
    custom_client_str = custom_client_str.split("'")
    custom_client_str_2 = custom_client_str[1].split("//")
    custom_client_name = custom_client_str_2[1]

    command_line = ("pyscenic grnboost -o grnboost2_output_sample_DGE.csv"
                    " --num_workers 30 --client_or_address " + custom_client_name +
                    " ~/path_to_DGE.tsv ~/path_to_TFs_file.tsv")
    os.system(command_line)
    print("GRNBOOST DONE")
    quit("yes")
Hope it helps!
Best, Hicham
Hicham Affia, MSc, Bioinformatician, Dr. Andelfinger's Cardiovascular Genetics Laboratory
CHU Sainte Justine
Centre de Recherche, 5.17.000
3175 chemin de la Côte-Sainte-Catherine, Montréal, Québec, H3T 1C5
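The address-parsing trick in the script above depends on how a Client prints itself. A slightly more robust variant, sketched here with the standard library only, parses a scheduler URL such as tcp://127.0.0.1:33045 directly and builds the same pyscenic grnboost command as an argument list. The helper names host_port and grnboost_command are made up for illustration. The sketch assumes the URL can be read from the cluster object (for example through a scheduler_address attribute on LocalCluster, which should be verified against the installed distributed version).

```python
from urllib.parse import urlsplit


def host_port(scheduler_address):
    """Strip the scheme from an address such as 'tcp://127.0.0.1:33045'."""
    netloc = urlsplit(scheduler_address).netloc
    if not netloc:
        raise ValueError("expected scheme://host:port, got %r" % (scheduler_address,))
    return netloc


def grnboost_command(address, expression_file, tf_file, output_file, num_workers):
    """Build the argument list for the CLI call used in the script above."""
    return [
        "pyscenic", "grnboost",
        "-o", output_file,
        "--num_workers", str(num_workers),
        "--client_or_address", address,
        expression_file, tf_file,
    ]


if __name__ == "__main__":
    # Assumed usage: the URL would come from the running cluster, e.g.
    #   address = host_port(local_cluster.scheduler_address)
    address = host_port("tcp://127.0.0.1:33045")
    print(grnboost_command(address, "DGE.tsv", "TFs.tsv", "grnboost2_output.csv", 30))
```

The resulting list can be passed to subprocess.run instead of os.system. Because no shell is involved in that case, paths starting with ~ need os.path.expanduser first.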