chainerrl
ChainerRL is a deep reinforcement learning library built on top of Chainer.
```
=================================== FAILURES ===================================
_____ TestTrainAgentAsync_param_1_{max_episode_len=None, num_envs=2}.test ______

self =

    def test(self):
        steps = 50
        outdir = tempfile.mkdtemp()

        agent = mock.Mock()
        agent.shared_attributes = []

        def _make_env(process_idx, test):
            env = mock.Mock()
...
```
https://arxiv.org/abs/1911.02140
This aims to resolve #572 by making everything that is passed to subprocesses pickle-able. Still WIP.
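As a quick illustration of the kind of check involved (our own helper, not ChainerRL code): an object can safely cross a process boundary only if it survives a pickle round-trip, so verifying that up front catches unpicklable arguments before they reach a subprocess.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives a full pickle round-trip."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


# Plain data structures pickle fine and can be sent to subprocesses.
assert is_picklable({"env_id": "CartPole-v0", "seed": 0})

# OS-level handles such as locks cannot be pickled, so passing them
# to a subprocess would fail.
assert not is_picklable(threading.Lock())
```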
```
% python examples/gym/train_a3c_gym.py 2 --env CartPole-v0
/Users/fujita-rental/.local/share/virtualenvs/chainerrl-TtJB_mwx/lib/python3.8/site-packages/chainer/_environment_check.py:33: UserWarning: Accelerate has been detected as a NumPy backend library.
vecLib, which is a part of Accelerate, is known not to work...
```
Since 12.6, OpenAI Gym has supported VectorEnv. Since we have our own VectorEnv, we should check compatibility and hopefully support theirs as well.
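To make the compatibility question concrete, here is a minimal sketch of the batched interface that both vectorized-env designs expose (the class and method names below are our own illustration, not the API of either library): `reset()` resets every sub-env at once and `step()` takes one action per sub-env, returning batched results.

```python
class ToyEnv:
    """A trivial counter environment used only for the demonstration."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 3  # episode ends after 3 steps
        return self.t, float(action), done, {}


class SyncVectorEnv:
    """Steps a list of sub-envs sequentially and batches the results."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obss, rews, dones, infos = map(list, zip(*results))
        return obss, rews, dones, infos


vec = SyncVectorEnv([ToyEnv for _ in range(2)])
obss = vec.reset()                         # one observation per sub-env
obss, rews, dones, infos = vec.step([1, 0])
```

Checking compatibility then mostly means verifying that the two libraries agree on this batched calling convention and on how terminated sub-envs are reset.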
```
2019-09-17 15:58:06.381228 STDOUT 2012] | chainerrl/tests/links_tests/test_stateless_recurrent_sequential.py .F... [100%]
2019-09-17 15:58:06.381228 STDOUT 2012] |
2019-09-17 15:58:06.381229 STDOUT 2012] | =================================== FAILURES ===================================
2019-09-17 15:58:06.381230 STDOUT 2012] | ...
```
```
______________ TestCastObservation_param_0.test_cast_observation _______________

self =

    def test_cast_observation(self):
        env = chainerrl.wrappers.CastObservation(
            gym.make(self.env_id), dtype=self.dtype)
        rtol = 1e-3 if self.dtype == np.float16 else 1e-7
        obs = env.reset()
        self.assertEqual(env.original_observation.dtype, np.float64)
        self.assertEqual(obs.dtype, self.dtype)
>   ...
```
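For context, the behavior this test exercises is a wrapper that casts each observation to a target dtype while keeping the uncast value available as `original_observation`. A minimal sketch of that behavior (our own simplified version against a dummy env, not ChainerRL's actual implementation):

```python
import numpy as np


class CastObservation:
    """Simplified sketch: cast observations to `dtype`, remembering the
    original so tests can compare before and after the cast."""

    def __init__(self, env, dtype):
        self.env = env
        self.dtype = dtype

    def _cast(self, obs):
        self.original_observation = obs
        return obs.astype(self.dtype)

    def reset(self):
        return self._cast(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._cast(obs), reward, done, info


class DummyEnv:
    """Stands in for gym.make(...); always emits float64 observations."""

    def reset(self):
        return np.zeros(2, dtype=np.float64)

    def step(self, action):
        return np.ones(2, dtype=np.float64), 0.0, False, {}


env = CastObservation(DummyEnv(), np.float32)
obs = env.reset()
assert env.original_observation.dtype == np.float64
assert obs.dtype == np.float32
```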