action constraints in PPO
Hi, what is the best way to implement action constraints in a PPOAgent? For a QPolicy I can use observation_and_action_constraint_splitter. Is there something equivalent for PPO policies?
For an environment with discrete actions you could use a similar pattern. For continuous actions it gets a bit trickier; you could use truncated normals for the distribution, for example.
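For the discrete case, the usual pattern is to set the logits of invalid actions to a large negative value before building the categorical distribution, so they get (near-)zero probability. A minimal NumPy sketch of the idea, outside any TF-Agents API (`mask_logits` and `masked_softmax` are illustrative helpers, not library functions):

```python
import numpy as np

def mask_logits(logits, valid_mask, neg_inf=-1e9):
    """Replace logits of invalid actions with a large negative value
    so softmax assigns them (near-)zero probability."""
    logits = np.asarray(logits, dtype=np.float64)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    return np.where(valid_mask, logits, neg_inf)

def masked_softmax(logits, valid_mask):
    # Mask first, then apply a numerically stable softmax.
    z = mask_logits(logits, valid_mask)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.5, -0.3]
mask = [True, False, True, False]  # actions 1 and 3 are invalid
probs = masked_softmax(logits, mask)
```

In a PPO policy network you would derive `valid_mask` from the observation (mirroring what observation_and_action_constraint_splitter does for a QPolicy) and apply the mask to the actor network's logits before sampling and before computing log-probabilities.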
#216 - for continuous actions
I have the discrete-action case, but I was wondering about the technical part: is there a feature that implements the observation_and_action_constraint_splitter functionality for PPOAgents?
Given that you added the label, I guess not yet.
Hello @niklasnolte, I want to train my PPO agent on a custom environment with a discrete action space, and compare its performance with a DQN agent. Have you figured out how to apply action constraints to a PPO agent? Thanks a lot!
edit: For anyone who encountered the same problem, I found a possible solution in #452.