multiagent-particle-envs
NN code
According to the paper, "our policies are parameterized by a two-layer ReLU MLP with 64 units per layer. To support discrete communication messages, we use the Gumbel-Softmax estimator [14]." However, I could not find this in the code. The only policy I can see (in policy.py) is hardcoded to read keyboard input. What should I use if my environment does not require input from the user?
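To make the question concrete, this is roughly what I expected to find, based only on the sentence quoted above. It is a minimal, untrained sketch and not the authors' code: the names `MLPPolicy`, `init_layer`, `dense` and `gumbel_softmax` are my own, I am assuming "two-layer" means two hidden layers of 64 units followed by an output layer, and the `action(obs)` method name is only my guess at how it would plug in next to the keyboard policy.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu=False):
    """Apply one dense layer to the vector x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
        # U is clamped away from 0 and 1 so both logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


class MLPPolicy:
    """Hypothetical stand-in for a learned policy (weights are random here)."""

    def __init__(self, obs_dim, n_actions, hidden=64, seed=0):
        self.rng = random.Random(seed)
        self.layers = [
            init_layer(obs_dim, hidden, self.rng),    # hidden layer 1
            init_layer(hidden, hidden, self.rng),     # hidden layer 2
            init_layer(hidden, n_actions, self.rng),  # output logits
        ]

    def action(self, obs, temperature=1.0):
        """Map an observation vector to a relaxed one-hot action vector."""
        h = dense(obs, self.layers[0], relu=True)
        h = dense(h, self.layers[1], relu=True)
        logits = dense(h, self.layers[2])
        return gumbel_softmax(logits, temperature, self.rng)
```

With something like this, `MLPPolicy(obs_dim=4, n_actions=5).action([0.1, -0.2, 0.3, 0.0])` would return a length-5 vector that sums to 1, with no keyboard involved. Training the weights is a separate matter that this sketch does not cover.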
I would appreciate an explanation of this point.