ml-agents
Feature Request: Standard API for handicapping trained agents by perturbing sensors and actuators
Is your feature request related to a problem? Please describe.
In my opinion, a good game AI plays at a similar level to the player. Choosing machine learning suggests that you want the AI to play by more or less the same rules as a human; otherwise you could use traditional AI methods with careful level design or hardcoded buffs.
But what do you do about an AI that plays under the same constraints as a human but is still highly skilled? Playing against it could be very frustrating.
Describe the solution you'd like
An API for handicapping trained agents by the occasional manipulation of `ISensor`s or `IActuator`s. Maybe this would come as a new `IHandicap` interface, or maybe it would be an additional `abstract` or `virtual` method for sensors and actuators. Or maybe it would just be some wrappers. I don't know. I'd like to hear your thoughts on the matter.
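To make the wrapper option a bit more concrete, here is a rough sketch of the shape I have in mind. It is written in Python for brevity and is not the ML-Agents C# API: `Sensor`, `HandicappedSensor`, `gaussian_noise`, and `ConstantSensor` are all hypothetical names invented for this illustration, and a real version would have to wrap `ISensor` instead.

```python
import random
from typing import Callable, List, Optional, Protocol


class Sensor(Protocol):
    """Hypothetical stand-in for a sensor: returns a flat observation vector."""

    def read(self) -> List[float]: ...


# A handicap takes the clean observation plus an RNG and returns a perturbed copy.
Handicap = Callable[[List[float], random.Random], List[float]]


class HandicappedSensor:
    """Wraps another sensor and occasionally applies a handicap to its output."""

    def __init__(self, inner: Sensor, handicap: Handicap,
                 probability: float, rng: Optional[random.Random] = None):
        self.inner = inner
        self.handicap = handicap
        self.probability = probability  # chance per read that the handicap fires
        self.rng = rng or random.Random()

    def read(self) -> List[float]:
        obs = list(self.inner.read())
        if self.rng.random() < self.probability:
            return self.handicap(obs, self.rng)
        return obs


def gaussian_noise(scale: float) -> Handicap:
    """Build a handicap that adds zero-mean Gaussian noise to every value."""
    def apply(obs: List[float], rng: random.Random) -> List[float]:
        return [x + rng.gauss(0.0, scale) for x in obs]
    return apply


class ConstantSensor:
    """Toy sensor used only to exercise the wrapper."""

    def __init__(self, values: List[float]):
        self.values = list(values)

    def read(self) -> List[float]:
        return list(self.values)
```

The agent would only ever see the wrapper, so the handicap could be tuned or disabled per difficulty level without retraining. Whether that is better than a dedicated interface or an extra method on the sensor itself is exactly what I'm unsure about.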
Describe alternatives you've considered
Implementing my own architecture for handicapping trained agents. Depending on how smart my game's AI becomes, I might do this. More than anything, I'm opening this ticket to put this use case in writing.
Another option would be to train an AI for less time. That would work, but I can't imagine it would be easier to control.
Additional context
People make mistakes when playing games, even if they're skilled and well-versed in the rules. I believe that many of these errors are not specific to any one game, and can therefore be simulated independently of any one sensor or actuator's implementation. Here are some ideas:
Human Error | Agent Handicap |
---|---|
Slow reaction time | Delay an actuator's response by a couple of frames |
Failing to notice an important object | Temporarily give one or more sensors incorrect data |
Misjudging the nature of an object (e.g. distance to an item) | Perturb one or more sensors |
Pressing the wrong button on a controller | Perturb one or more discrete actions |
Overwhelmed by lots of action (deer in headlights) | Randomize actuator output |
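The actuator-side rows of the table could look roughly like this. Again, this is a Python sketch with made-up names (`DelayedActions`, `wrong_button`, `panic`), and it assumes a single discrete action branch represented as an integer; it is not how ML-Agents represents actions.

```python
import random
from collections import deque


class DelayedActions:
    """Slow reaction time: emit the action chosen `delay` frames earlier."""

    def __init__(self, delay: int, idle_action: int):
        # Pre-fill with idle actions so nothing happens for the first `delay` frames.
        self.buffer = deque([idle_action] * delay)

    def step(self, action: int) -> int:
        self.buffer.append(action)
        return self.buffer.popleft()


def wrong_button(action: int, num_actions: int,
                 probability: float, rng: random.Random) -> int:
    """Pressing the wrong button: sometimes swap in a different discrete action."""
    if num_actions > 1 and rng.random() < probability:
        others = [a for a in range(num_actions) if a != action]
        return rng.choice(others)
    return action


def panic(num_actions: int, rng: random.Random) -> int:
    """Deer in headlights: ignore the policy and pick a uniformly random action."""
    return rng.randrange(num_actions)
```

For example, `DelayedActions(2, idle_action=0)` fed the actions 5, 6, 7, 8 on consecutive frames would emit 0, 0, 5, 6. None of these functions depend on what the actions mean in a particular game, which is why I think they could live in a shared API.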
Hi @JesseTG
This is a problem that has long been on our radar and may be the focus of future projects. We appreciate your suggested solutions as they are aligned with some ideas that we have discussed internally. Thank you for raising this.