
Agent Struggles to Balance / Bullet Simulation Singularity Issue?

Open: gaoalexander opened this issue 4 years ago • 1 comment

Hello @xbpeng @erwincoumans ,

Thank you for your amazing work.

I am training policies on new actions that have been retargeted from the Adobe Mixamo animation library. For certain actions with less dynamic motion, e.g. the agent stands in one place and does something simple like raising its arm, the agent consistently fails to stay upright for longer than 300-500 steps.

I would have expected a behavior like balancing to be learned implicitly, as a byproduct of maximizing the reward.

I ran a follow-up experiment in which the agent simply stands upright with no motion. Again, the agent stands for about 500 steps, then topples over. Something strange seems to happen with its foot: after a certain amount of time, the policy suddenly applies a disproportionate amount of force to one foot, and balance is quickly lost.

Please reference the following video: Simple Standing Up PyBullet Visualization

Wondering if you have any ideas as to what might be going on here?

Thanks so much.

gaoalexander, Feb 01 '21 18:02

I'm not very familiar with the PyBullet implementation, but one guess is that the time limit for each episode during training is too short. In that case the character only ever needs to learn to stand for a few seconds, and it can drift into some weird behaviors after that. It could also be an issue with the discount factor. The default discount factor in DeepMimic is 0.95, which gives the policy an effective horizon of about 20 steps (roughly 1 / (1 - 0.95)). A larger discount factor (e.g. 0.99) might help with a task like this, where the character has to plan more steps ahead in order to maintain balance.
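To make the discount-factor point concrete, here is a minimal sketch of the usual rule of thumb that the effective horizon is about 1 / (1 - gamma). The function names `effective_horizon` and `discount_weight` are made up for illustration and are not part of DeepMimic or PyBullet. The 300-step lookahead is taken from the step counts reported in the question.

```python
def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb horizon for discount factor gamma: 1 / (1 - gamma)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)


def discount_weight(gamma: float, steps_ahead: int) -> float:
    """Weight gamma**k that the return assigns to a reward k steps ahead."""
    return gamma ** steps_ahead


if __name__ == "__main__":
    # Compare the default discount mentioned above with the suggested larger one.
    for gamma in (0.95, 0.99):
        horizon = effective_horizon(gamma)
        weight = discount_weight(gamma, 300)  # 300 steps: roughly when the agent falls
        print(f"gamma={gamma}: horizon ~{horizon:.0f} steps, "
              f"weight of a reward 300 steps ahead = {weight:.2e}")
```

With gamma = 0.95, a reward 300 steps in the future contributes almost nothing to the return, so falling over at that point is barely penalized from the viewpoint of earlier states. With gamma = 0.99 the same reward still carries a noticeable weight.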

In general, learning to stand still can actually be harder than walking. When the character is moving its feet, it can adjust its foot placement to maintain balance, but it can't do that when it's just standing still. So static motions like these are often harder to learn than more dynamic ones. But I think with some tuning, these motions should also work.

xbpeng, Feb 06 '21 22:02