ml-agents Feature Request: Standard API for handicapping trained agents by perturbing sensors and actuators

Open • JesseTG opened this issue 3 years ago • 1 comment

Is your feature request related to a problem? Please describe. In my opinion, a good game AI plays at roughly the same level as the player. Choosing machine learning suggests that you want the AI to play by more or less the same rules as a human; otherwise you could use traditional AI methods with careful level design or hardcoded buffs.

But what do you do about an AI that plays under the same constraints as a human but is still far more skilled? Playing against it could be very frustrating.

Describe the solution you'd like An API for handicapping trained agents by occasionally manipulating the data that ISensors produce or that IActuators receive. This might take the form of a new IHandicap interface, an additional abstract or virtual method on sensors and actuators, or simply a set of wrappers. I don't know which is best, and I'd like to hear your thoughts on the matter.
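
To make the wrapper option more concrete, here is a minimal, engine-agnostic sketch in Python rather than ML-Agents C# code. Every name in it (Sensor, HandicappedSensor, ConstantSensor, blind) is hypothetical and chosen for illustration; none of them is part of the ML-Agents API. The idea is that a wrapper forwards the inner sensor's observation unchanged most of the time and, with some configurable probability, passes it through a perturbation function first.

```python
import random
from typing import Callable, List, Optional, Protocol


class Sensor(Protocol):
    """Hypothetical stand-in for a sensor; NOT the ML-Agents ISensor interface."""

    def read(self) -> List[float]: ...


# A perturbation maps a correct observation to a (possibly wrong) one.
Perturbation = Callable[[List[float], random.Random], List[float]]


class HandicappedSensor:
    """Wraps any sensor and occasionally corrupts what it reports."""

    def __init__(self, inner: Sensor, perturb: Perturbation,
                 probability: float, seed: Optional[int] = None) -> None:
        self.inner = inner
        self.perturb = perturb
        self.probability = probability  # chance of a handicap on each read
        self.rng = random.Random(seed)  # private RNG keeps runs reproducible

    def read(self) -> List[float]:
        observation = list(self.inner.read())  # copy so the inner data stays intact
        if self.rng.random() < self.probability:
            return self.perturb(observation, self.rng)
        return observation


class ConstantSensor:
    """Toy sensor used only to demonstrate the wrapper."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def read(self) -> List[float]:
        return list(self.values)


def blind(observation: List[float], rng: random.Random) -> List[float]:
    """Example perturbation: the agent momentarily sees nothing."""
    return [0.0] * len(observation)


# Usage: roughly one read in ten comes back blank.
sensor = HandicappedSensor(ConstantSensor([1.0, 2.0]), blind, probability=0.1, seed=0)
readings = [sensor.read() for _ in range(20)]
```

Because the wrapper only depends on a read method, the same perturbation could be reused across different sensors, and the probability parameter gives a single knob for tuning difficulty. An actuator wrapper could follow the same pattern on the action side.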

Describe alternatives you've considered Implementing my own architecture for handicapping trained agents. Depending on how smart my game's AI becomes, I might do this. More than anything, I'm opening this ticket to put this use case in writing.

Another option would be to train an AI for less time. That would work, but I can't imagine the resulting skill level would be any easier to control.

Additional context People make mistakes when playing games, even if they're skilled and well-versed in the rules. I believe that many of these errors are not specific to any one game, and can therefore be simulated independently of any one sensor or actuator's implementation. Here are some ideas:

Human Error	Agent Handicap
Slow reaction time	Delay an actuator's response by a couple of frames
Failing to notice an important object	Temporarily give one or more sensors incorrect data
Misjudging the nature of an object (e.g. distance to an item)	Perturb one or more sensors
Pressing the wrong button on a controller	Perturb one or more discrete actions
Overwhelmed by lots of action (deer in headlights)	Randomize actuator output
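
The rows of the table above could each be sketched as a small, game-independent function. The following Python is only an illustration under stated assumptions: observations are flat lists of floats, discrete actions are integers in range(num_actions), and all names (DelayedActuator, overlook, misjudge, wrong_button, panic) are hypothetical rather than ML-Agents API.

```python
import random
from collections import deque
from typing import Deque, Iterable, List


class DelayedActuator:
    """Slow reaction time: each action is applied `delay` steps after it was chosen."""

    def __init__(self, delay: int, idle_action: int = 0) -> None:
        # Pre-fill with idle actions so the first `delay` steps do nothing.
        self.buffer: Deque[int] = deque([idle_action] * delay)

    def step(self, action: int) -> int:
        self.buffer.append(action)
        return self.buffer.popleft()


def overlook(observation: List[float], indices: Iterable[int],
             fill: float = 0.0) -> List[float]:
    """Failing to notice an object: blank out the slots that describe it."""
    hidden = set(indices)
    return [fill if i in hidden else x for i, x in enumerate(observation)]


def misjudge(observation: List[float], sigma: float,
             rng: random.Random) -> List[float]:
    """Misjudging distance and similar: add zero-mean Gaussian noise."""
    return [x + rng.gauss(0.0, sigma) for x in observation]


def wrong_button(action: int, num_actions: int, error_rate: float,
                 rng: random.Random) -> int:
    """Pressing the wrong button: sometimes swap in a different action."""
    if num_actions > 1 and rng.random() < error_rate:
        other = rng.randrange(num_actions - 1)
        # Shift past the intended action so the result is always different.
        return other if other < action else other + 1
    return action


def panic(num_actions: int, rng: random.Random) -> int:
    """Deer in headlights: ignore the policy and act uniformly at random."""
    return rng.randrange(num_actions)
```

Each function takes its own strength parameter (delay, sigma, error_rate), so difficulty could be adjusted at runtime without retraining the agent.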

Oct 01 '21 20:10 JesseTG

Hi @JesseTG

This is a problem that has long been on our radar and may be the focus of future projects. We appreciate your suggested solutions as they are aligned with some ideas that we have discussed internally. Thank you for raising this.

Oct 05 '21 21:10 andrewcoh
