safeRL
Recovery policy code only supports single-instance rollout and training
In the following lines, the reward is specified as a list with a single element: https://github.com/hari-sikchi/safeRL/blob/b4f0443b109d5d3290771528115087eb5dd763ce/safe_recovery/agent2.py#L352-L354
For multiple parallel rollouts, this list should instead be filled asynchronously, with one entry per simulation instance.
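A minimal sketch of what filling the list asynchronously could look like. It is not taken from the repository: `run_rollout` and `collect_rewards` are hypothetical names, and the rollout is a dummy stand-in for one simulation instance that returns its total reward.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_rollout(seed, horizon=10):
    """Hypothetical stand-in for one simulation instance; returns its total reward."""
    rng = random.Random(seed)  # per-instance RNG so workers do not share state
    return sum(rng.random() for _ in range(horizon))


def collect_rewards(num_rollouts, max_workers=4):
    """Fill a reward list with one entry per rollout as the workers finish."""
    rewards = [None] * num_rollouts  # slot i belongs to rollout i
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of the rollout it runs.
        futures = {pool.submit(run_rollout, seed): seed for seed in range(num_rollouts)}
        for future in as_completed(futures):  # completion order is arbitrary
            rewards[futures[future]] = future.result()
    return rewards


if __name__ == "__main__":
    print(collect_rewards(num_rollouts=4))
```

Each rollout writes only to its own slot, so the list ends up with one reward per instance regardless of the order in which they finish. For CPU-bound simulators, `ProcessPoolExecutor` offers the same interface and avoids contention on the GIL.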