Reinforcement-learning-with-tensorflow issues

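Low compute resource utilization

I am running the REINFORCE code on a server and found that GPU utilization is only about 30%, while CPU utilization is not maxed out either, staying below 80%. What causes this? The code does not perform any logging I/O. Is this a problem inherent to REINFORCE, or is something else responsible? Thanks!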

Program error in the treasure on right example

Has anyone run into a KeyError while running the examples? ![微信图片_20220508150508](https://user-images.githubusercontent.com/84264941/167285672-0e56080a-929c-44f6-9f54-16a8eb2e687c.png)

xiaohu-art

DQN code does not handle done being True when computing q_target

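A question for Morvan: when the DQN code computes q_target, does it fail to handle the case where done is True, i.e. q_target = Reward? The experiences stored in the replay memory do not include done either. Why is that?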

ananasfl
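
For reference, here is a minimal sketch of what handling done in the target could look like. It is not the repository's code: the function name `q_targets`, its arguments, and the idea of storing a `done` flag alongside each transition are assumptions made for illustration. For terminal transitions the bootstrap term is dropped, so the target is just the reward.

```python
def q_targets(rewards, next_q_values, dones, gamma=0.9):
    """Compute one-step Q-learning targets for a batch of transitions.

    rewards:       list of float, reward r for each transition
    next_q_values: list of lists, Q(s', a) for every action a (from the target net)
    dones:         list of bool, True if s' is terminal (hypothetical extra field
                   that would have to be stored in the replay memory)
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        # No future return after a terminal state, so do not bootstrap.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets
```

With a NumPy batch the same idea is usually written as `r + gamma * q_next.max(axis=1) * (1 - done)`.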

Curiosity algorithm

I want to learn the curiosity algorithm. Do you have any examples or explanations? How can I implement it in the DDPG algorithm? Could anyone add it to, for example, the 2D car example?

lamare3423
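
One common form of curiosity rewards the agent with the prediction error of a learned forward model. The sketch below only shows how such a bonus could be combined with the environment reward before a transition goes into the DDPG replay memory. The function names, the `scale` parameter, and the existence of a forward model that produces `predicted_next_state` are all assumptions for illustration; this is not code from the repository.

```python
def intrinsic_reward(predicted_next_state, next_state, scale=0.5):
    """Curiosity bonus: scaled mean squared error between the forward model's
    predicted next state and the next state actually observed."""
    errors = [(p - s) ** 2 for p, s in zip(predicted_next_state, next_state)]
    return scale * sum(errors) / len(errors)


def total_reward(extrinsic, predicted_next_state, next_state, scale=0.5):
    """Reward to store in the replay memory: environment reward plus bonus."""
    return extrinsic + intrinsic_reward(predicted_next_state, next_state, scale)
```

The bonus is large in states the forward model predicts poorly and shrinks as the model improves, which is what pushes the agent toward unfamiliar states.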

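How can I show the trend of the DDPG reward in TensorBoard?

I tried to add ep_reward (line 146) in [DDPG_update2.py](https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG_update2.py) to the values monitored by TensorBoard, but since this variable is not part of the computation graph, I could not find a suitable way to display how it changes during training in the TensorBoard web UI in real time. Does anyone have a good approach?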

thingsareright
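
One possible approach is to build the summary protobuf by hand, which lets a plain Python number be logged without adding any op to the graph. The sketch below assumes a TensorFlow version that provides `tf.compat.v1` (1.13 or later, including 2.x). The log directory, the helper name `log_scalar`, and the reward values are placeholders, not taken from DDPG_update2.py.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style API, also available under TF2


def log_scalar(writer, tag, value, step):
    """Write one Python number to TensorBoard without using a graph op."""
    summary = tf1.Summary(
        value=[tf1.Summary.Value(tag=tag, simple_value=float(value))]
    )
    writer.add_summary(summary, global_step=step)
    writer.flush()  # flush so the web UI can pick up new points during training


# FileWriter has to be created in graph mode; it raises under TF2 eager execution.
with tf.Graph().as_default():
    writer = tf1.summary.FileWriter("logs/ddpg")  # placeholder log directory

# Placeholder values standing in for ep_reward at the end of each episode.
for episode, ep_reward in enumerate([-1200.0, -950.5, -400.2]):
    log_scalar(writer, "ep_reward", ep_reward, episode)

writer.close()
```

Running `tensorboard --logdir logs` should then show an `ep_reward` curve under the Scalars tab.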