Happy issues

Results: 11 issues of Happy

MADDPG-torch sample/predict action

Add a `predict` function, rewrite `sample`, and add `run_eval_episode`.

Torch ppo

Torch PPO for discrete-action environments (Atari games)

Mujoco version

Several examples have strict requirements on the `Mujoco` version (e.g. mujoco-py==2.0.2.13 in ES and mujoco-py==2.0.2.8 in CQL, both of which need mujoco200). However, mujoco200 and earlier versions are no longer available.

## Add an env wrapper for gym environments

### Add files:
- remote_gym_env_wrapper.py ---> PARL/parl/utils
- remote_gym_env_wrapper_test.py ---> PARL/parl/utils/tests

### Change file:
- PARL/parl/utils/__init__.py

No module named 'xparl_test'

### Case
I tried to build a remote Mujoco env wrapper and wrote a class to pass the env's data (mainly action_space & observation_space), but I got the following error.
+ Dependencies...
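For illustration only, here is a minimal sketch of passing an env's space data as plain values rather than as live space objects. It is unrelated to the cause of the reported error and is not PARL's implementation. `SpaceSpec`, `summarize_space`, and `FakeBox` are hypothetical names, and the sketch assumes 1-D Box-like spaces that expose `shape`, `low`, and `high`.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpaceSpec:
    """Plain-data summary of a 1-D Box-like space (hypothetical helper, not part of PARL)."""
    shape: Tuple[int, ...]
    low: Tuple[float, ...]
    high: Tuple[float, ...]


def summarize_space(space) -> SpaceSpec:
    """Copy shape/low/high out of any object exposing those attributes."""
    return SpaceSpec(
        shape=tuple(int(n) for n in space.shape),
        low=tuple(float(x) for x in space.low),
        high=tuple(float(x) for x in space.high),
    )


def to_json(spec: SpaceSpec) -> str:
    """Serialize the summary to a JSON string that can cross a process boundary."""
    return json.dumps(asdict(spec))


def from_json(payload: str) -> SpaceSpec:
    """Rebuild the summary; JSON turns tuples into lists, so convert them back."""
    data = json.loads(payload)
    return SpaceSpec(**{key: tuple(value) for key, value in data.items()})


class FakeBox:
    """Stand-in for a gym Box space so the sketch runs without gym installed."""

    def __init__(self, low, high):
        self.low = list(low)
        self.high = list(high)
        self.shape = (len(self.low),)


# Example: summarize a 2-D action space and round-trip it through JSON.
spec = summarize_space(FakeBox([-1.0, -1.0], [1.0, 1.0]))
restored = from_json(to_json(spec))
```

Sending only tuples of numbers keeps the receiving side from needing the same environment classes importable, at the cost of dropping anything the space object carries beyond its bounds and shape.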

update requirements

xparl unit test fails

Disable remote module testing in [PR939](https://github.com/PaddlePaddle/PARL/pull/939/commits/2741898602e0b9f216f2eed6c49a9438236c84fd)

ppo_mujocov2

CompatWrapper impact

Using [CompatWrapper](https://github.com/PaddlePaddle/PARL/blob/6c85147db7eb0bec4ac30980c71a42fe1f5b6057/parl/env/compat_wrappers.py#L95) slows down the algorithm's convergence, e.g. [TD3 in Humanoid](https://github.com/benchmarking-rl/PARL-experiments/pull/15/files?short_path=ebee0e2#diff-ebee0e282ca1cf6f7c49ffb6531ee917d22cb4b19c0a9a75c4e071820a5cb971).
