Youtube-Code-Repository

Repository for most of the code from my YouTube channel

48 Youtube-Code-Repository issues, sorted by recently updated

Remove the two parameters layer1_size=256 and layer2_size=256 from the constructor, since they are not used anywhere in the code.
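
A hypothetical before/after sketch of the cleanup being suggested; the class name and the remaining constructor arguments are illustrative, not the repository's exact code:

```python
# Hypothetical sketch: drop layer1_size / layer2_size because nothing in the class reads them.

# before:
# def __init__(self, lr, gamma, n_actions, layer1_size=256, layer2_size=256):

# after:
class Agent:
    def __init__(self, lr, gamma, n_actions):
        self.lr = lr
        self.gamma = gamma
        self.n_actions = n_actions
```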

Faster implementation of the q_target calculation for training the DDDQN - in your video you mention how slowly it runs. With this small change, it should run significantly...
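
A minimal sketch of what such a vectorized `q_target` update can look like for a Dueling Double DQN; the random arrays stand in for network outputs and replay-buffer samples, and all names (`q_pred`, `q_next`, `q_eval_next`, `gamma`, ...) are assumptions rather than the repository's exact variables:

```python
import numpy as np

# Placeholder batch data standing in for network predictions and sampled transitions.
rng = np.random.default_rng(0)
batch_size, n_actions, gamma = 64, 4, 0.99

q_pred      = rng.standard_normal((batch_size, n_actions))  # online net, current states
q_next      = rng.standard_normal((batch_size, n_actions))  # target net, next states
q_eval_next = rng.standard_normal((batch_size, n_actions))  # online net, next states
actions = rng.integers(0, n_actions, size=batch_size)
rewards = rng.standard_normal(batch_size)
dones   = rng.integers(0, 2, size=batch_size).astype(np.float64)

batch_index = np.arange(batch_size)
max_actions = np.argmax(q_eval_next, axis=1)   # Double-DQN action selection via the online net

# One fancy-indexed assignment updates the whole batch at once.
q_target = q_pred.copy()
q_target[batch_index, actions] = rewards + gamma * q_next[batch_index, max_actions] * (1.0 - dones)
```

The single vectorized assignment replaces a Python loop over the batch, which is where most of the speedup would come from.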

Hello, I'm trying to run your Lunar Lander code, 'main_torch_dqn_lunar_lander.py' in the 'archive' folder, but it does not start. The following is the error. Thank you. File "C:\Users\ys-th\Desktop\Spring2022\main_torch_dqn_lunar_lander.py", line 15, in env...

`magicSquares[resultingState]` should be changed to `self.magicSquares[resultingState]` https://github.com/philtabor/Youtube-Code-Repository/blob/33b1e4b4aaa453c31307d6bd4bb05e37ff24cb2b/ReinforcementLearning/Fundamentals/gridworld.py#L63
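
A minimal sketch of why the `self.` prefix is needed: inside a method, `magicSquares` is an instance attribute, so referring to it bare raises a `NameError`. The surrounding method body here is illustrative, not the repository's actual `gridworld.py`:

```python
class GridWorld:
    def __init__(self, magic_squares):
        # magicSquares maps a source square to the square the agent is teleported to
        self.magicSquares = magic_squares

    def step(self, resultingState):
        # was: magicSquares[resultingState]  -> NameError, the bare name is not defined
        if resultingState in self.magicSquares:
            resultingState = self.magicSquares[resultingState]
        return resultingState


env = GridWorld({18: 54, 63: 14})
print(env.step(18))   # 54
```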

In the [line used to define the returns](https://github.com/philtabor/Youtube-Code-Repository/blob/1ef76059bf55f7df9ccc09fce0e0bfb7c13e89bd/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py#L186), we use the GAE + values as the target for the critic to learn. Is this correct? My intuition says no --...
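
For context, a minimal sketch (with illustrative names, not the repository's exact code) of the usual GAE bookkeeping, where the critic target is the GAE advantage added back onto the value prediction, i.e. R_t = A_t + V(s_t):

```python
import numpy as np

def gae_returns(rewards, values, dones, gamma=0.99, gae_lambda=0.95):
    """Compute GAE advantages and the corresponding critic targets (returns)."""
    values = np.asarray(values, dtype=np.float64)
    T = len(rewards)
    advantages = np.zeros(T)
    last_adv = 0.0
    for t in reversed(range(T)):
        next_value = values[t + 1] if t + 1 < T else 0.0
        next_nonterminal = 1.0 - dones[t]
        delta = rewards[t] + gamma * next_value * next_nonterminal - values[t]   # TD error
        last_adv = delta + gamma * gae_lambda * next_nonterminal * last_adv      # GAE recursion
        advantages[t] = last_adv
    returns = advantages + values[:T]    # critic regression target: A_t + V(s_t)
    return advantages, returns


adv, ret = gae_returns(rewards=np.ones(5), values=np.zeros(5), dones=np.array([0, 0, 0, 0, 1]))
```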

As mentioned in the video (https://youtu.be/LawaN3BdI00?t=1420), we don't need it.

I get this error: `ValueError` Traceback (most recent call last), raised at `action = agent.choose_action(observation)` inside the `while not done:` training loop, just before `observation_, reward, done, info...`

If I understand correctly, the code in [tensorflow2/actor_critic.py](https://github.com/philtabor/Youtube-Code-Repository/blob/master/ReinforcementLearning/PolicyGradient/actor_critic/tensorflow2/actor_critic.py) implements the `One-step Actor-Critic (episodic)` algorithm given on page 332 of RLbook2020 by Sutton & Barto (picture given below). ![image](https://user-images.githubusercontent.com/24864163/151010488-98627635-31cc-406c-8664-fe0a8cac9350.png) Here we can see...
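
For comparison, a minimal sketch of that One-step Actor-Critic update written in PyTorch rather than the repository's TensorFlow 2 code; `actor_critic` is assumed to return action probabilities and a scalar state value, and all names here are illustrative, not the repo's API:

```python
import torch

def one_step_update(actor_critic, optimizer, state, action, reward, state_, done, gamma=0.99):
    probs, value = actor_critic(state)            # pi(.|s) and V(s)
    with torch.no_grad():
        _, value_ = actor_critic(state_)          # V(s'), treated as a fixed target
    delta = reward + gamma * value_ * (1 - int(done)) - value    # one-step TD error
    actor_loss = -torch.log(probs[action]) * delta.detach()      # policy step, weighted by delta
    critic_loss = delta ** 2                                      # value step: squared TD error
    optimizer.zero_grad()
    (actor_loss + critic_loss).backward()
    optimizer.step()
```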

Hi @philtabor! Thanks for the awesome code and video! Those really help me to study and understand reinforcement learning. I found that the actor and critic models are not...