WRN No vacant cpu resources at the moment, will try 300 times later
I can't run any code with parl; I always get that error. This is how I start the cluster on my local machine (Windows 10):
xparl start --port 8010
# The Parl cluster is started at localhost:8010.
# A local worker with 8 CPUs is connected to the cluster.
# Starting the cluster monitor...
## If you want to check cluster status, please view:
http://192.168.1.99:61581
or call:
xparl status
## If you want to add more CPU resources, please call:
xparl connect --address 192.168.1.99:8010
## If you want to shutdown the cluster, please call:
xparl stop
And this is what I get with the status command:
xparl status
# Cluster localhost:8010 has 0 used cpus, 0 vacant cpus.
# If you want to check cluster status, please view: http://192.168.1.99:61721
Hi, thanks for your feedback. Can you provide more environment information?
- Python version
- parl version
- running terminal
- Python 3.7.9
- parl==1.3.2
- running at the command line
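For anyone else hitting this, the details requested above could be collected with a small script like the one below. This is only a sketch: `collect_env` and `package_version` are hypothetical helper names, not part of parl, and the package list is an assumption based on what comes up in this thread.

```python
import platform
import sys


def package_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for Python 3.7, which relies on setuptools' pkg_resources.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def collect_env(packages=("parl", "torch", "numpy")):
    """Gather the Python version, the OS, and the versions of the given packages."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    env = collect_env()
    print("Python:", env["python"])
    print("Platform:", env["platform"])
    for name, ver in env["packages"].items():
        print("{}: {}".format(name, ver if ver is not None else "not installed"))
```

Pasting the printed lines into the issue gives the maintainers the Python and package versions in one go; a package reported as "not installed" simply means no distribution with that name was found in the active environment.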
Hi, I cannot reproduce the error in the same running environment (win10, python3.7.9 and parl==1.3.2).
It looks like the worker cannot start normally. Can you try running the command:
xparl connect --address 192.168.1.99:8010
after running the command xparl start --port 8010,
and tell us what error information it prints?
I thought it was something specific to Windows 10. Since you couldn't reproduce the error, I cleaned up everything and reinstalled Python and parl at the same versions, and now it's working. Thanks!
# Cluster localhost:8010 has 0 used cpus, 8 vacant cpus.
Glad to hear that. Feel free to reopen the issue if you have other problems:)
I have the issue again, but now I have narrowed it down a little more.
With a clean install of Python + parl only, I can start, get status, and stop many times with no issue:
# Cluster localhost:8010 has 0 used cpus, 8 vacant cpus.
But then, after installing pytorch (tried 1.6.0 and 1.7.0):
# Cluster localhost:8010 has 0 used cpus, 0 vacant cpus.
After uninstalling pytorch, parl works again:
# Cluster localhost:8010 has 0 used cpus, 8 vacant cpus.
Somehow pytorch is messing up parl. Any ideas?
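To compare the states before and after installing pytorch without eyeballing the output, the status line could be parsed as in the sketch below. It assumes the line format shown in the outputs above ("Cluster ... has N used cpus, M vacant cpus"); `parse_cluster_status` and `worker_registered` are hypothetical helpers, not part of parl.

```python
import re

# Matches lines such as:
#   # Cluster localhost:8010 has 0 used cpus, 8 vacant cpus.
STATUS_RE = re.compile(
    r"Cluster\s+(?P<address>\S+)\s+has\s+(?P<used>\d+)\s+used cpus,"
    r"\s+(?P<vacant>\d+)\s+vacant cpus"
)


def parse_cluster_status(text):
    """Extract the address and CPU counts from status output, or None if absent."""
    match = STATUS_RE.search(text)
    if match is None:
        return None
    return {
        "address": match.group("address"),
        "used": int(match.group("used")),
        "vacant": int(match.group("vacant")),
    }


def worker_registered(text):
    """True if the cluster reports at least one CPU (used or vacant)."""
    status = parse_cluster_status(text)
    return status is not None and (status["used"] + status["vacant"]) > 0


if __name__ == "__main__":
    broken = "# Cluster localhost:8010 has 0 used cpus, 0 vacant cpus."
    working = "# Cluster localhost:8010 has 0 used cpus, 8 vacant cpus."
    print(worker_registered(broken), worker_registered(working))
```

A result of zero used and zero vacant CPUs matches the failing case in this thread, where the cluster is up but no worker has registered any CPUs with it.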
Hi, I still cannot reproduce the error (I installed torch==1.7.0).
Maybe you can try running the command xparl connect --address 192.168.1.99:8010 and see what happens.
Hi, I ran into the same problem when running the alphago project in benchmark. Environment: Python 3.7.9, parl==1.3.2, torch==1.7.0 (tried both the CPU and GPU versions), running at the command line on Ubuntu 18.04.
xparl status
[09-10 15:53:24 MainThread @logger.py:224] Argv: /home/hxu/anaconda3/envs/parl/bin/xparl connect --address 192.168.70.105:8010
/home/hxu/anaconda3/envs/parl/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
/home/hxu/anaconda3/envs/parl/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
(parl) hxu@hxu:~/netease/PARL/benchmark/torch/AlphaZero$ xparl connect --address 192.168.70.105:8010
[09-10 15:53:24 MainThread @logger.py:224] Argv: /home/hxu/anaconda3/envs/parl/bin/xparl connect --address 192.168.70.105:8010
/home/hxu/anaconda3/envs/parl/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
/home/hxu/anaconda3/envs/parl/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
python main.py # in AlphaGo Dirs
[09-10 15:53:34 MainThread @remote_decorator.py:178] WRN No vacant cpu resources at the moment, will try 300 times later.
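The numpy.ufunc RuntimeWarning in the log above may or may not be related to the missing CPUs, but it can be useful to know which import triggers it. The sketch below imports a list of modules one by one and records any that emit a matching warning. `imports_with_warning` is a hypothetical helper and the module names in the example are assumptions; run it in a fresh interpreter, since a module that is already imported will not emit its import-time warnings again.

```python
import importlib
import warnings


def imports_with_warning(module_names, needle="numpy.ufunc size changed"):
    """Import each module and report those whose import emits a matching warning.

    Returns a dict mapping module name to the first matching warning message,
    or to an "import failed: ..." note if the module could not be imported.
    """
    flagged = {}
    for name in module_names:
        with warnings.catch_warnings(record=True) as caught:
            # Record every warning, even ones normally shown only once.
            warnings.simplefilter("always")
            try:
                importlib.import_module(name)
            except ImportError as exc:
                flagged[name] = "import failed: {}".format(exc)
                continue
        messages = [str(w.message) for w in caught if needle in str(w.message)]
        if messages:
            flagged[name] = messages[0]
    return flagged


if __name__ == "__main__":
    # Assumed candidates; adjust to whatever is installed in the environment.
    for module, message in imports_with_warning(["numpy", "torch", "parl"]).items():
        print("{}: {}".format(module, message))
```

If a module shows up here with the "size changed" message, one thing to try is reinstalling numpy and that package so that their compiled parts are built against the same numpy version.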