AttributeError: 'NoneType' object has no attribute 'dumps' for tf_py_environment

Open vonadz opened this issue 4 years ago • 15 comments

OS: 5.6.15-arch1-1 (Arch Linux)
Python: 3.8
Pip list:

absl-py 0.9.0
appdirs 1.4.4
astor 0.8.1
astroid 2.4.1
astunparse 1.6.3
backcall 0.1.0
beautifulsoup4 4.9.1
Brlapi 0.7.0
btrfsutil 1.2.0
CacheControl 0.12.6
cachetools 4.1.0
ceph-volume 1.0.0
cephfs 2.0.0
cephfs-shell 0.0.1
certifi 2020.4.5.2
chardet 3.0.4
cloudpickle 1.3.0
colorama 0.4.3
contextlib2 0.6.0.post1
cycler 0.10.0
decorator 4.4.2
distlib 0.3.0
distro 1.5.0
dm-tree 0.1.5
EasyProcess 0.3
future 0.18.2
gast 0.3.3
gin-config 0.3.0
google 2.0.3
google-auth 1.17.0
google-auth-oauthlib 0.4.1
google-pasta 0.2.0
grpcio 1.29.0
gym 0.17.2
h5py 2.10.0
html5lib 1.0.1
idna 2.9
imageio 2.8.0
ipython 7.15.0
ipython-genutils 0.2.0
isort 4.3.21
jedi 0.17.0
Jinja2 2.11.2
joblib 0.15.1
Keras 2.3.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
kiwisolver 1.2.0
lazy-object-proxy 1.4.3
lensfun 0.3.95
louis 3.14.0
Markdown 3.2.2
MarkupSafe 1.1.1
matplotlib 3.2.1
mccabe 0.6.1
mock 4.0.2
msgpack 1.0.0
numpy 1.18.5
oauthlib 3.1.0
opt-einsum 3.2.1
ordered-set 3.1.1
packaging 20.4
pandas 1.0.4
parso 0.7.0
pep517 0.8.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 7.1.2
pip 20.0.2
progress 1.5
prompt-toolkit 3.0.5
protobuf 3.12.2
ptyprocess 0.6.0
pwquality 1.4.2
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.5.0
pycairo 1.19.1
pyglet 1.5.0
Pygments 2.6.1
PyGObject 3.36.1
pylint 2.5.2
PyOpenGL 3.1.5
pyparsing 2.4.7
python-dateutil 2.8.1
python-libtorrent 1.2.7
pytoml 0.1.21
pytz 2020.1
PyVirtualDisplay 1.3.2
pyxdg 0.26
PyYAML 5.3.1
rados 2.0.0
rbd 2.0.0
Reflector 2020.3.21.11.40.36
requests 2.23.0
requests-oauthlib 1.3.0
retrying 1.3.3
rgw 2.0.0
rsa 4.1
scikit-learn 0.23.1
scipy 1.4.1
setuptools 47.1.1
six 1.15.0
sklearn 0.0
soupsieve 2.0.1
tb-nightly 2.3.0a20200611
team 1.0
tensorboard-plugin-wit 1.6.0.post3
termcolor 1.1.0
tf-agents-nightly 0.6.0.dev20200611
tf-estimator-nightly 2.3.0.dev2020061101
tf-nightly 2.3.0.dev20200611
tfp-nightly 0.11.0.dev20200611
threadpoolctl 2.1.0
toml 0.10.1
torch 1.5.0
traitlets 4.3.3
urllib3 1.25.9
wcwidth 0.1.9
webencodings 0.5.1
Werkzeug 1.0.1
wheel 0.34.2
wrapt 1.12.1

Following this guide: https://www.tensorflow.org/agents/tutorials/1_dqn_tutorial

These lines:

train_env = tf_py_environment.TFPyEnvironment(train_py_env)
eval_env = tf_py_environment.TFPyEnvironment(eval_py_env)

Produce this error:

Exception ignored in: <function Pool.__del__ at 0x7f4ba9686670>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'
Exception ignored in: <function Pool.__del__ at 0x7f4ba9686670>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'

vonadz avatar Jun 11 '20 13:06 vonadz

This same error pops up with non-nightly builds, in a virtual env, on the same machine, with the following packages:

absl-py 0.9.0
astunparse 1.6.3
backcall 0.2.0
cachetools 4.1.0
certifi 2020.4.5.2
chardet 3.0.4
cloudpickle 1.4.1
cycler 0.10.0
decorator 4.4.2
EasyProcess 0.3
future 0.18.2
gast 0.3.3
gin-config 0.1.3
google-auth 1.17.1
google-auth-oauthlib 0.4.1
google-pasta 0.2.0
grpcio 1.29.0
gym 0.10.11
h5py 2.10.0
idna 2.9
imageio 2.4.0
ipython 7.15.0
ipython-genutils 0.2.0
jedi 0.17.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
kiwisolver 1.2.0
Markdown 3.2.2
matplotlib 3.2.1
mock 4.0.2
numpy 1.18.5
oauthlib 3.1.0
opt-einsum 3.2.1
parso 0.7.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 7.1.2
pip 20.1.1
prompt-toolkit 3.0.5
protobuf 3.12.2
ptyprocess 0.6.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyglet 1.3.2
Pygments 2.6.1
pyparsing 2.4.7
python-dateutil 2.8.1
PyVirtualDisplay 1.3.2
requests 2.23.0
requests-oauthlib 1.3.0
rsa 4.2
scipy 1.4.1
setuptools 41.2.0
six 1.15.0
tensorboard 2.2.2
tensorboard-plugin-wit 1.6.0.post3
tensorflow 2.2.0
tensorflow-estimator 2.2.0
tensorflow-probability 0.10.0
termcolor 1.1.0
tf-agents 0.5.0
traitlets 4.3.3
urllib3 1.25.9
wcwidth 0.2.4
Werkzeug 1.0.1
wheel 0.34.2
wrapt 1.12.1

vonadz avatar Jun 12 '20 00:06 vonadz

@vonadz facing the same issue with the same tutorial. Is it working on the nightly build though?

kamalojasv181 avatar Sep 05 '20 17:09 kamalojasv181

@vonadz facing the same issue with the same tutorial. Is it working on the nightly build though?

Not sure. I haven't checked on this for a while.

vonadz avatar Sep 05 '20 17:09 vonadz

I'm facing the same problem. When I run:

env = suite_gym.load('CartPole-v0')
env = tfa.environments.tf_py_environment.TFPyEnvironment(env)

I get:

2021-02-28 16:36:26.147133: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Exception ignored in: <function Pool.__del__ at 0x7f66b923b280>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'

Process finished with exit code 0

My venv (columns appear to be package | installed version | latest version):

Keras-Preprocessing | 1.1.2 | 1.1.2
Markdown | 3.3.4 | 3.3.4
Pillow | 7.2.0 | 8.1.0
Werkzeug | 1.0.1 | 1.0.1
absl-py | 0.11.0 | 0.11.0
astunparse | 1.6.3 | 1.6.3
cachetools | 4.2.1 | 4.2.1
certifi | 2020.12.5 | 2020.12.5
chardet | 4.0.0 | 4.0.0
cloudpickle | 1.6.0 | 1.6.0
decorator | 4.4.2 | 4.4.2
dm-tree | 0.1.5 | 0.1.5
flatbuffers | 1.12 | 1.12
future | 0.18.2 | 0.18.2
gast | 0.3.3 | 0.4.0
gin-config | 0.4.0 | 0.4.0
google-auth | 1.27.0 | 1.27.0
google-auth-oauthlib | 0.4.2 | 0.4.2
google-pasta | 0.2.0 | 0.2.0
grpcio | 1.32.0 | 1.36.0
gym | 0.18.0 | 0.18.0
h5py | 2.10.0 | 3.1.0
idna | 2.10 | 3.1
numpy | 1.19.5 | 1.20.1
oauthlib | 3.1.0 | 3.1.0
opt-einsum | 3.3.0 | 3.3.0
pip | 21.0 | 21.0.1
protobuf | 3.15.3 | 3.15.3
pyasn1 | 0.4.8 | 0.4.8
pyasn1-modules | 0.2.8 | 0.2.8
pyglet | 1.5.0 | 1.5.15
requests | 2.25.1 | 2.25.1
requests-oauthlib | 1.3.0 | 1.3.0
rsa | 4.7.2 | 4.7.2
scipy | 1.6.1 | 1.6.1
setuptools | 52.0.0 | 54.0.0
six | 1.15.0 | 1.15.0
tensorboard | 2.4.1 | 2.4.1
tensorboard-plugin-wit | 1.8.0 | 1.8.0
tensorflow | 2.4.1 | 2.4.1
tensorflow-estimator | 2.4.0 | 2.4.0
tensorflow-probability | 0.12.1 | 0.12.1
termcolor | 1.1.0 | 1.1.0
tf-agents | 0.7.1 | 0.7.1
typing-extensions | 3.7.4.3 | 3.7.4.3
urllib3 | 1.26.3 | 1.26.3
wheel | 0.36.2 | 0.36.2
wrapt | 1.12.1 | 1.12.1

JacobFV avatar Feb 28 '21 22:02 JacobFV

I have the same issue.

Traceback (most recent call last):
  File "/home/piotr/projects/fits/ai/dqn_tutorial.py", line 118, in <module>
    fc_layer_params = (100, 50)
KeyboardInterrupt
Exception ignored in: <function Pool.__del__ at 0x7f45c0c7f160>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'
Exception ignored in: <function Pool.__del__ at 0x7f45c0c7f160>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'

svagier avatar Mar 04 '21 20:03 svagier

I have the same issue. It runs fine in a Jupyter notebook, but when I convert it to a Python script and run that, it generates the error.

git4sun avatar Apr 05 '21 03:04 git4sun

I ran into the same issue. If anyone has resolved it, please share.

Nafees-060 avatar Apr 17 '21 00:04 Nafees-060

Can you help us debug by running in pdb and including the full backtrace?

ebrevdo avatar Apr 17 '21 02:04 ebrevdo

I ran into the same issue. If anyone has resolved it, please share.

In my case, if the script ended at train_env = tf_py_environment.TFPyEnvironment(train_py_env), it threw this exception. But if I created the agent and ran the session, it didn't throw any exceptions. My guess is that it is lazily loaded: if you don't create a session in the code, it throws the exception; if you do create one, the line is executed in the session.

git4sun avatar Apr 18 '21 21:04 git4sun

@git4sun please provide a repro example. I especially want to know what you set the isolation flag to.

ebrevdo avatar Apr 18 '21 23:04 ebrevdo

Hi, @ebrevdo and everyone,

I think I found where the bug happens. First, you need to set isolation = True to create a ThreadPool.

Then, at the very end of your script, add train_env.close().

BUT this is not the end of the story.

The TFPyEnvironment.close method calls self._pool.join() first and then self._pool.close(), which raises another error: Pool is still running. https://github.com/tensorflow/agents/blob/74cff73d8d985bbf08d7d2dac1b675e3c4bd82c2/tf_agents/environments/tf_py_environment.py#L200-L203

To solve this problem, simply correct the TFPyEnvironment.close method to call

self._pool.close() first, then self._pool.join()

reference: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.join

You are good to go~~~
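
The ordering point can be checked with the standard library alone. The following is a minimal sketch, assuming Python 3.8 or later (the version in the tracebacks above); the helper names are hypothetical and nothing here touches TF-Agents itself. It only illustrates that join() on a running pool raises "Pool is still running", while close() followed by join() shuts down cleanly.

```python
from multiprocessing.pool import ThreadPool


def shutdown_in_documented_order(pool):
    """Stop accepting new work, then wait for the workers to exit."""
    pool.close()
    pool.join()


def join_before_close(pool):
    """Return the error message raised when join() precedes close()."""
    try:
        pool.join()
    except ValueError as err:
        return str(err)
    return None


pool = ThreadPool(processes=1)

# The pool is still running here, so join() refuses to proceed.
message = join_before_close(pool)

# The failed join() leaves the pool usable.
squares = pool.map(lambda x: x * x, [1, 2, 3])

shutdown_in_documented_order(pool)
print(message)
print(squares)
```

Whether this ordering alone removes the 'dumps' error in a given TF-Agents setup is not confirmed by the sketch.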

QuantHao avatar May 17 '21 12:05 QuantHao

@oars @summer-yue @sguada Hi, I saw that you are currently actively maintaining this repo. Should I report this issue to you? Thanks.

QuantHao avatar May 24 '21 06:05 QuantHao

@QuantHao thanks for the pointer. I was experiencing this issue while trying to run parallel TF-Agents drivers using mpi4py. Just adding env.close() at the end of my script solved it for me; I did not need to set env.isolation = True.
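
One way to make sure close() always runs at the end of a script is a try/finally block. This is only a sketch: PooledEnv is a hypothetical stand-in class that owns a thread pool, not the TF-Agents environment, and the assumption is that env.close() on the real environment would be called in the same position.

```python
from multiprocessing.pool import ThreadPool


class PooledEnv:
    """Hypothetical stand-in for an environment that owns a worker pool."""

    def __init__(self):
        self._pool = ThreadPool(processes=1)
        self.closed = False

    def step(self, action):
        # Run the step on the worker thread and wait for the result.
        return self._pool.apply(lambda a: a + 1, (action,))

    def close(self):
        # Release the pool explicitly instead of leaving it to __del__.
        self._pool.close()
        self._pool.join()
        self.closed = True


env = PooledEnv()
try:
    result = env.step(1)
finally:
    # Runs even if the code above raises, so the pool never outlives the script.
    env.close()
```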

samarth-robo avatar Jul 02 '21 20:07 samarth-robo

ISSUE FIXED: put the distributed training code in a separate method and call it inside the main method. GitHub code
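
One possible reading of this suggestion, sketched with the standard library only: run_training and main are hypothetical names, and a plain ThreadPool stands in for whatever the training code creates. The assumed benefit, which this thread does not confirm, is that objects local to a function are released when it returns, well before the interpreter starts tearing down modules.

```python
from multiprocessing.pool import ThreadPool


def run_training():
    """Hypothetical training routine; the pool lives only inside this call."""
    pool = ThreadPool(processes=2)
    try:
        # Placeholder workload standing in for the real training steps.
        return sum(pool.map(lambda x: x * 2, range(5)))
    finally:
        pool.close()
        pool.join()


def main():
    total = run_training()
    print(total)
    return total


if __name__ == "__main__":
    main()
```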

SaralaSewwandi avatar Oct 05 '21 15:10 SaralaSewwandi

Hi, @ebrevdo and everyone,

I think I found where the bug happens. First, you need to set isolation = True to create a ThreadPool.

Then, at the very end of your script, add train_env.close().

BUT this is not the end of the story.

The TFPyEnvironment.close method calls self._pool.join() first and then self._pool.close(), which raises another error: Pool is still running.

https://github.com/tensorflow/agents/blob/74cff73d8d985bbf08d7d2dac1b675e3c4bd82c2/tf_agents/environments/tf_py_environment.py#L200-L203

To solve this problem, simply correct the TFPyEnvironment.close method to call

self._pool.close() first, then self._pool.join()

reference: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.join

You are good to go~~~

Sorry to bother you. I have set self._pool.close() first and then self._pool.join(), but the error still occurred.

Exception ignored in: <function Pool.__del__ at 0x7effb2de58b0>
Traceback (most recent call last):
  File "/home/XXX/anaconda3/envs/XXX/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "/home/XXX/anaconda3/envs/XXX/lib/python3.8/multiprocessing/queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'

feifei-feifei-hub avatar Jun 20 '23 02:06 feifei-feifei-hub