DeepRL-TensorFlow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
No changes have been made to the code; my TensorFlow version is 2.2. Could that be the cause? I get the following error from gym's rendering module:

```
Traceback (most recent call last):
  File "E:\anaconda\envs\tf2\lib\site-packages\gym\envs\classic_control\rendering.py", line 165, in __del__
    ...
  File "E:\anaconda\envs\tf2\lib\site-packages\gym\envs\classic_control\rendering.py", line 81, in close
AttributeError: 'Viewer' object has no attribute 'isopen'
```
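A minimal workaround sketch, assuming the traceback comes from the known gym issue where the pyglet `Viewer` is cleaned up before its window finished initializing (common on headless or misconfigured setups): only call `env.render()` when explicitly enabled, and guard the call. The `RENDER` flag and random policy below are illustrative, not part of this repo.

```python
import gym

# Illustrative workaround: never create the pyglet Viewer unless rendering
# is explicitly requested, and fall back gracefully if it fails.
RENDER = False  # set True only if a working display is available

env = gym.make("CartPole-v1")
state = env.reset()            # old gym API (pre-0.26), as used around TF 2.2
done = False
while not done:
    if RENDER:
        try:
            env.render()       # may fail in headless/conda setups
        except Exception as exc:
            print(f"Rendering disabled: {exc}")
            RENDER = False
    action = env.action_space.sample()          # placeholder policy
    state, reward, done, info = env.step(action)
env.close()
```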
Hello everyone. I am trying to run the A3C continuous script, but I am getting an error saying "unrecognized arguments" (please see the attached picture). How can I solve this?
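In most cases, "unrecognized arguments" simply means the command-line flags do not match what the script's `argparse` parser defines (a typo, wrong capitalization, or a flag the script does not accept). A minimal sketch of how the error arises; the flag names below are illustrative, so check the actual `parser.add_argument(...)` calls near the top of `A3C_Continuous.py` for the real ones.

```python
import argparse

# Illustrative parser; the real flags are defined in A3C_Continuous.py.
parser = argparse.ArgumentParser()
parser.add_argument('--gamma', type=float, default=0.99)
parser.add_argument('--update_interval', type=int, default=5)
args = parser.parse_args()

# Running e.g. `python A3C_Continuous.py --Gamma 0.99` (wrong capitalization)
# or `python A3C_Continuous.py --lr 0.001` (flag not defined by the parser)
# exits with: error: unrecognized arguments: ...
```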
https://github.com/marload/DeepRL-TensorFlow2/blob/876266d9a5fcf7d8a7c7e3afd8b110085b32b615/PPO/PPO_Discrete.py#L151-L154 https://github.com/marload/DeepRL-TensorFlow2/blob/876266d9a5fcf7d8a7c7e3afd8b110085b32b615/PPO/PPO_Continuous.py#L167-L170 In `PPO_Discrete` each reward is multiplied by `0.01`, and in `PPO_Continuous` the reward is also rescaled. I don't understand why these modifications are made. What do they do?
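These look like reward-scaling tricks: CartPole's undiscounted return can reach several hundred, while Pendulum's per-step rewards lie roughly in `[-16.3, 0]`, so shrinking (and, in the continuous case, shifting) the reward keeps value targets and advantages in a small numeric range, which typically stabilizes PPO training; a pure positive rescaling does not change which policy is optimal. A minimal sketch of the same idea as a gym wrapper; the `0.01` mirrors the linked discrete script, but the wrapper itself is not from this repo.

```python
import gym

class ScaledReward(gym.RewardWrapper):
    """Rescale raw rewards before the agent sees them.

    Smaller rewards keep discounted returns (and hence value targets
    and advantages) in a modest numeric range, which usually makes
    PPO training more stable.
    """
    def __init__(self, env, scale=0.01):
        super().__init__(env)
        self.scale = scale

    def reward(self, reward):
        return reward * self.scale

# Usage: wrap the environment once, train as usual.
env = ScaledReward(gym.make("CartPole-v1"), scale=0.01)
```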
In the Actor network, it seems that `from_logits` should be set to `False` in `tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)`, since you added a softmax in the last layer. :)
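That observation seems right: `from_logits=True` tells the loss to apply a softmax internally, so it expects raw, unnormalized scores. If the Actor's last layer already applies a softmax, the consistent setting is `from_logits=False`; the alternative is to drop the softmax and keep `from_logits=True`. A small sketch of the two consistent pairings, with illustrative layer sizes (4-dim state, 2 actions):

```python
import tensorflow as tf

# Pairing 1: model outputs probabilities -> from_logits=False
probs_model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(2, activation='softmax'),   # probabilities
])
loss_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

# Pairing 2: model outputs raw logits -> from_logits=True
logits_model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(2),                          # raw logits, no softmax
])
loss_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Mixing softmax outputs with from_logits=True applies softmax twice inside
# the loss and silently skews the gradients.
```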