Deep-Reinforcement-Learning-in-Large-Discrete-Action-Spaces issues

Results 5 Deep-Reinforcement-Learning-in-Large-Discrete-Action-Spaces issues


target_action in Agent.py

Hi, at line 144 of Agent.py in ddpg, you use `state` to get target_action. I think it should be `state_2`. In the original ddpg.py of stevenpig's implementation, he also uses...

ydlu
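The point raised in this report can be illustrated with a minimal sketch of the standard DDPG critic target, y = r + gamma * Q'(s', mu'(s')), where the target action is computed from the next state. This is not the repository's code: `critic_targets`, the toy `target_actor` and `target_critic` callables, and the batch layout are all assumptions made for illustration; only the names `state`, `state_2`, and `target_action` come from the report.

```python
def critic_targets(batch, target_actor, target_critic, gamma=0.99):
    """Return y = r + gamma * (1 - done) * Q'(s', mu'(s')) for each transition.

    Hypothetical helper: each batch item is assumed to be a tuple
    (state, action, reward, state_2, done).
    """
    targets = []
    for state, action, reward, state_2, done in batch:
        # The target action is chosen for the NEXT state (state_2),
        # not for the current state.
        target_action = target_actor(state_2)
        q_next = target_critic(state_2, target_action)
        not_done = 0.0 if done else 1.0
        targets.append(reward + gamma * not_done * q_next)
    return targets


# Toy stand-ins for the target networks (assumptions, scalar states/actions).
target_actor = lambda s: 2.0 * s
target_critic = lambda s, a: s + a

batch = [
    (1.0, 0.0, 1.0, 3.0, False),  # non-terminal transition
    (2.0, 0.0, 0.5, 4.0, True),   # terminal transition: no bootstrap term
]
print(critic_targets(batch, target_actor, target_critic, gamma=0.5))
# prints [5.5, 0.5]
```

Passing `state` instead of `state_2` to the target actor would bootstrap from an action chosen for the wrong state, which is the mismatch the report describes.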

How to deal with exponential action space?

Hello, I'm new to large action spaces, and I'm trying to do some work on large discrete action spaces. So will it work, or could it be applied for...

hedongyan

why does the target action a' in Q(s', a') for training critic net directly come from the target actor net

Hi jimkon, in the original paper, the action for training the critic net comes from the full policy. But in your master branch, the action is just given by the target actor...

unrecall
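To make the distinction in this question concrete, here is a rough sketch of the two ways a target action a' could be chosen: directly from the target actor, or from a "full policy" that maps the actor's proto-action to its k nearest discrete actions and keeps the one the critic scores highest. This reflects my reading of the question, not the repository or the paper verbatim; `full_policy_action`, the one-dimensional action list, and the toy actor and critic are assumptions.

```python
def full_policy_action(state, actor, critic, actions, k):
    """Pick a discrete action via proto-action -> k nearest -> best by critic.

    Hypothetical helper using scalar actions to keep the sketch short.
    """
    proto = actor(state)
    # k discrete actions closest to the continuous proto-action.
    neighbors = sorted(actions, key=lambda a: abs(a - proto))[:k]
    # Among those candidates, keep the one the critic values most.
    return max(neighbors, key=lambda a: critic(state, a))


# Toy stand-ins (assumptions): the actor proposes 1.2, the critic prefers 2.0.
actions = [0.0, 1.0, 2.0, 3.0]
actor = lambda s: 1.2
critic = lambda s, a: -(a - 2.0) ** 2

# With k=1 the result is just the discrete action nearest the actor output.
print(full_policy_action(0.0, actor, critic, actions, k=1))  # prints 1.0
# With k=2 the critic re-ranks the candidates and a different action wins.
print(full_policy_action(0.0, actor, critic, actions, k=2))  # prints 2.0
```

The two calls show why the choice matters for the critic target: Q(s', a') is evaluated at a different a' depending on whether the critic refinement step is included.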

assert(npts >= num_neighbors) AssertionError

When I change k_ratio in args to generate multiple actions, an AssertionError is raised: Traceback (most recent call last): File "/Users/xx/Downloads/DROO-master/mec/rlmodel/LDAS/main.py", line 211, in train(args.train_iter, agent, env, evaluate, File "/Users/xx/Downloads/DROO-master/mec/rlmodel/LDAS/main.py",...

Reeseee
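The assertion text suggests that the nearest-neighbor search was asked for more neighbors than there are points in the action set. One possible guard, sketched under that assumption, is to derive the neighbor count from the ratio and clamp it to the number of available actions. `neighbors_from_ratio` and its parameter names are hypothetical and are not taken from the repository.

```python
import math


def neighbors_from_ratio(num_actions, k_ratio):
    """Turn a ratio into a neighbor count k with 1 <= k <= num_actions.

    Hypothetical helper: num_actions is the number of points in the
    discrete action set, k_ratio the fraction of them to retrieve.
    """
    if num_actions < 1:
        raise ValueError("the action set must contain at least one action")
    k = math.ceil(num_actions * k_ratio)
    # Clamp so the search never requests more neighbors than points exist,
    # and always requests at least one.
    return max(1, min(k, num_actions))


print(neighbors_from_ratio(10, 0.5))  # prints 5
print(neighbors_from_ratio(10, 2.0))  # prints 10 (clamped to the action count)
print(neighbors_from_ratio(10, 0.0))  # prints 1 (at least one neighbor)
```

Whether clamping or rejecting an out-of-range ratio is the better behavior depends on how the caller uses k; the sketch clamps only to keep the example short.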

Handle the hyperparameters

I could not find how to adjust the exploration rate, exploration-exploitation policy, discount rate, number of warm-up steps, etc. Please help me out!

sahasubhajit
