Reinforcement-learning-with-tensorflow issues

Validating the trained model with a provided trajectory

This is more of a question than an issue. Once the model is trained, how do I validate it with a trajectory that I provide? So as...

rodhakate

PyTorch version of your code

Hi Morvan Zhou, thanks for the great repo. [Here](https://github.com/tessavdheiden/joint_model/blob/main/train/policy_train_torch.py) is a PyTorch version of your code. Results from the two implementations: TensorFlow (your code) ![reward_policy_tf](https://user-images.githubusercontent.com/24938569/109938220-11e20900-7cd0-11eb-9e5e-494f7eb08aa7.png) PyTorch (my code) ![reward_policy](https://user-images.githubusercontent.com/24938569/109938257-1c040780-7cd0-11eb-8dcd-41ee756f59ed.png)

tessavdheiden

What is the replace doing?

Hi! Could you please explain what the replace here does? https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/1fd1c08a6c8928d027fb75d51ebf6f9441e3dc33/experiments/Robot_arm/DDPG.py#L101 It somehow passes t_params and e_params to the session.

tessavdheiden
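
For context on the question above: in DDPG-style code, a "replace" operation usually copies or blends the evaluation network's parameters into the target network's parameters. The sketch below illustrates that idea in plain NumPy. It is an assumption about what the linked line does, not a copy of it; `hard_replace`, `soft_replace`, and `tau` are hypothetical names chosen for illustration. In TensorFlow 1.x the same idea is typically written as a list of `tf.assign` ops that is built once and later executed with `sess.run`, which may be why the parameters appear to be handed to the session.

```python
import numpy as np

def hard_replace(t_params, e_params):
    """Overwrite each target parameter with the matching evaluation parameter."""
    for t, e in zip(t_params, e_params):
        t[...] = e  # in-place copy, so t_params keeps the same array objects

def soft_replace(t_params, e_params, tau=0.01):
    """Move each target parameter a fraction tau toward the evaluation parameter."""
    for t, e in zip(t_params, e_params):
        t[...] = (1.0 - tau) * t + tau * e

# Toy "networks": each is just a list of parameter arrays (illustrative values).
e_params = [np.ones((2, 2)), np.full(3, 2.0)]
t_params = [np.zeros((2, 2)), np.zeros(3)]

soft_replace(t_params, e_params, tau=0.5)   # target moves halfway to eval
soft_result = [t.copy() for t in t_params]

hard_replace(t_params, e_params)            # target now equals eval exactly
```

A small `tau` keeps the target network changing slowly, which is the usual motivation for the soft variant; the hard variant is normally run only every fixed number of training steps.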

About Atari

Atari includes many games, such as Breakout-v0 and SpaceInvaders-ram-v0. How can I see the source code for these games?

Precola

Which reinforcement learning algorithm is suitable for solving a large maze (e.g. 100x100)?

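The mazes in the tutorials are fairly small and simple. For a large maze, such as 100*100, with a more complex environment, which reinforcement learning algorithm should I choose? I tried several algorithms and found that Q-learning does not seem to find the optimal solution, while DQN trains too slowly to reach a solution. What causes these problems: the random policy selection, or some other parameter settings? Or is there a reinforcement learning algorithm better suited to this? Any guidance would be appreciated. Thanks!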

TimDingg