
AntMuJoCoEnv-v0 contains many unnecessary states.

Open seolhokim opened this issue 3 years ago • 13 comments

import gym
import pybulletgym  # registers the PyBullet environments with gym

env = gym.make("AntMuJoCoEnv-v0")
env.reset()
for i in range(10):
    state, _, _, _ = env.step(env.action_space.sample())
print(state)
array([ 0.48600267, -0.02747473, -0.0488695 ,  0.55320414,  0.76180714,
       -0.46062074,  0.7321935 ,  0.43551934, -0.59654198,  0.01406484,
       -0.9421499 ,  0.35980088,  0.96942428,  0.34243858,  0.07390555,
       -0.01478774, -0.01189533,  0.00519861,  0.10619358, -0.02251298,
        0.04225551,  0.03821672, -0.03983905,  0.0030749 , -0.06877575,
        0.03366648,  0.06564061,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ])

I think we can remove many states after a certain index, what do you think?

seolhokim avatar Mar 27 '21 12:03 seolhokim

I was just about to ask about the root cause of this issue. This is the line:

https://github.com/benelot/pybullet-gym/blob/ec9e87459dd76d92fe3e59ee4417e5a665504f62/pybulletgym/envs/mujoco/robots/locomotors/ant.py#L21

I'm reasonably familiar with pybullet, but not at all with mujoco. If someone could tell me what cfrc_ext actually refers to, I wouldn't mind having a dig around to see if I can find the equivalent (no promises that I will find it, though).

sash-a avatar Mar 28 '21 20:03 sash-a

OK, a quick update: cfrc_ext refers to the contact points. I would imagine it is in there for ground contact points, but then the (14, 6) shape doesn't really make sense, as you would expect an (x, 4) or (4, x) shape for the 4 feet. According to this thread, those values are often, but not always, 0 in mujoco as well. So @seolhokim, I wouldn't worry too much about it. It should definitely be left in so that the shape matches the mujoco envs; if you dig through the code, you will find quite a lot of places where state values are simply set to 0 to maintain the same shape as the mujoco envs.

According to this thread, it does seem possible to get the forces, at least in C; hopefully these methods are callable from python too. @benelot, are you accepting pull requests? If so, I will try to find something that works. Also, any idea why the shape is (14, 6)? My best guess is that the ant has 14 body parts and the contact force vector has length 6, but I'm really not sure.

sash-a avatar Mar 28 '21 20:03 sash-a

When building all the envs, I first wanted them to comply with the observation sizes of the original mujoco envs. This is why I appended 0s to the observation values I knew, to make the observation the same length. I then intended to identify all the missing observation values in mujoco and find the corresponding ones in pybullet, but I could not find any description of them at all!
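To make the padding idea concrete, here is a minimal sketch, not the code used in pybullet-gym. The function name and constants are hypothetical; the 27 known values and the (14, 6) placeholder shape are taken from the printed state and the discussion above.

```python
import numpy as np

# Assumed sizes, taken from this thread: 27 populated values,
# followed by a (14, 6) block that is currently unavailable.
N_BODIES, WRENCH_DIM = 14, 6


def pad_observation(known_obs):
    """Append zeros in place of the missing contact-force block.

    Hypothetical helper for illustration only.
    """
    known_obs = np.asarray(known_obs, dtype=np.float64)
    placeholder = np.zeros(N_BODIES * WRENCH_DIM)
    return np.concatenate([known_obs, placeholder])


obs = pad_observation(np.ones(27))
print(obs.shape)
```

With 27 known values, the padded vector has 27 + 84 = 111 entries, which matches the length of the state printed at the top of this issue.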

Check it out here: https://github.com/openai/gym/blob/c8a659369d98706b3c98b84b80a34a832bbdc6c0/gym/envs/mujoco/ant.py#L35

If anybody finds out what cfrc_ext is in mujoco and what it corresponds to in pybullet, I am happy to help you work all this out.

benelot avatar Mar 28 '21 20:03 benelot

I absolutely accept pull requests.

benelot avatar Mar 28 '21 20:03 benelot

Contact points? Maybe then it is similar to the ant in roboschool that I ported here as well. There is also some contact point related stuff there if I remember well.

benelot avatar Mar 28 '21 20:03 benelot

Btw, these missing observations are all over the place in the mujoco code. If you are eager, talented and interested in figuring out what they are, I am really happy to help if I can, as I somehow was not able to figure it out at the time.

benelot avatar Mar 28 '21 20:03 benelot

Sure I'll definitely have a dig around for these contact points, can't commit to all the other missing values, but if I have some extra time I will look for them.

I do see foot contact points here

https://github.com/benelot/pybullet-gym/blob/ec9e87459dd76d92fe3e59ee4417e5a665504f62/pybulletgym/envs/mujoco/envs/locomotion/walker_base_env.py#L70-L78

But as this is only for 4 feet I don't think it would give us the correct shape. I think I would need to have a proper dig around the mujoco code to see why the array is the shape that it is, which I will do tomorrow or the next day :)
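For what it's worth, a per-link array could in principle be built from pybullet's contact points rather than from the feet only. The sketch below is an assumption-heavy illustration, not something tested against the ant: `contact_forces_per_link` and `n_links` are hypothetical names, `client` is expected to be the `pybullet` module or a `BulletClient`, the tuple indices follow my reading of the `getContactPoints` return layout, friction is ignored, and the torque columns are left at zero. It should not be read as an equivalent of cfrc_ext.

```python
import numpy as np


def contact_forces_per_link(client, body_id, n_links):
    """Sum contact normal forces per link into an (n_links + 1, 6) array.

    Row 0 is the base (pybullet link index -1); row i + 1 is link i.
    Columns 0..2 hold the summed normal force, columns 3..5 stay zero.
    """
    wrench = np.zeros((n_links + 1, 6))
    for contact in client.getContactPoints(bodyA=body_id):
        link_index = contact[3]                # linkIndexA
        normal_on_b = np.asarray(contact[7])   # contactNormalOnB, pointing towards A
        normal_force = contact[9]              # normalForce
        wrench[link_index + 1, :3] += normal_force * normal_on_b
    return wrench
```

Whether the rows line up with the body ordering that mujoco uses would still need to be checked.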

sash-a avatar Mar 28 '21 21:03 sash-a

@benelot

111-dim observation space

z (height) of the Torso -> 1

orientation (quaternion x,y,z,w) of the Torso -> 4

8 Joint angles -> 8

3-dim directional velocity and 3-dim angular velocity -> 3+3=6

8 Joint velocity -> 8

External forces (force x,y,z + torque x,y,z) applied to the CoM of each link (Ant has 14 links: ground + torso + 12 leg links (3 links for each of the 4 legs)) -> (3+3)*14=84

I found this in https://enginius.tistory.com/734

Is it helpful to fill in the exact state?
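As a quick check of the arithmetic, the breakdown above can be written as a slicing table. This is only a sketch: the names in `LAYOUT` and the function `split_observation` are made up for illustration, and the ordering simply follows the list above rather than anything verified against the mujoco or pybullet-gym source.

```python
import numpy as np

# (name, size) pairs in the order listed above; names are illustrative.
LAYOUT = [
    ("torso_height", 1),
    ("torso_orientation", 4),
    ("joint_angles", 8),
    ("torso_linear_velocity", 3),
    ("torso_angular_velocity", 3),
    ("joint_velocities", 8),
    ("external_forces", 84),
]


def split_observation(obs):
    """Split a flat observation into named parts according to LAYOUT."""
    obs = np.asarray(obs)
    expected = sum(size for _, size in LAYOUT)
    if obs.shape != (expected,):
        raise ValueError(f"expected shape ({expected},), got {obs.shape}")
    parts, start = {}, 0
    for name, size in LAYOUT:
        parts[name] = obs[start:start + size]
        start += size
    # 14 links, each with force x,y,z + torque x,y,z
    parts["external_forces"] = parts["external_forces"].reshape(14, 6)
    return parts
```

The sizes add up to 1 + 4 + 8 + 6 + 8 + 84 = 111, and the first five groups total 27, which is the number of non-zero entries in the state printed at the top of this issue.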

seolhokim avatar Mar 30 '21 07:03 seolhokim

External forces (force x,y,z + torque x,y,z) applied to the CoM of each link (Ant has 14 links: ground + torso + 12 leg links (3 links for each of the 4 legs)) -> (3+3)*14=84

@seolhokim thanks, this is super helpful! I should have some time later today to see if I can work out how to get those external forces in pybullet :)

sash-a avatar Mar 30 '21 08:03 sash-a

@sash-a Oh I forgot to tag you. I believe you can do that! Thank you! :)

seolhokim avatar Mar 30 '21 10:03 seolhokim

Sounds great. In a future refactoring, these foot contact calculations could then go into a "mujoco layer" that is able to generate appropriate observations for all types of robots in environments such that replacing mujoco with open source is more achievable in the future.

@seolhokim The initial intent of this project was to replace the MuJoCo-based openai gym envs with open source software. I therefore try to reproduce everything this entails, including the exact (or approximate) observation state. We will not get to the point where both mujoco and pybullet give exactly the same responses to the same actions, but I hope we get close enough that mujoco-trained agents can run in pybullet and achieve similar performance.

benelot avatar Mar 30 '21 12:03 benelot

I'm pretty sure I've found what we're looking for although it doesn't look like it's going to work. According to the docs we would need a torque sensor on the joints, which I think means we would have to modify the xml assets and I'm not too sure if that's a good idea.

The method is getJointState, and one of its outputs is jointReactionForces, which according to the docs is "list of 6 floats | These are the joint reaction forces, if a torque sensor is enabled for this joint. Without torque sensor, it is [0,0,0,0,0,0]". I'm pretty sure this is what we are looking for; however, I never found this to produce a value other than 0, which means that the joints don't have torque sensors. Given this thread, I think we can leave them as is, since mujoco seems to have a similar problem.
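For anyone who wants to try this, here is a minimal sketch of reading that output. The helper names are hypothetical, and `client` is assumed to be the `pybullet` module or a `BulletClient`. pybullet also exposes `enableJointForceTorqueSensor`, which may be enough to switch the sensor on at runtime; whether that works for the ant assets here is untested, and whether joint reaction forces are a fair stand-in for cfrc_ext remains a guess.

```python
def enable_torque_sensors(client, body_id, joint_indices):
    """Ask pybullet to report reaction forces for the given joints."""
    for joint_index in joint_indices:
        client.enableJointForceTorqueSensor(body_id, joint_index, enableSensor=1)


def read_joint_reaction_forces(client, body_id, joint_indices):
    """Return one 6-element list per joint.

    getJointState returns (position, velocity, reactionForces,
    appliedMotorTorque); index 2 is the jointReactionForces entry
    quoted from the docs above.
    """
    return [
        list(client.getJointState(body_id, joint_index)[2])
        for joint_index in joint_indices
    ]
```

The sensors would need to be enabled before stepping the simulation, with the forces read after the step.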

If you want to have a look for yourself I put up a quick way to use this method in this commit ca7ab786af02286838325a71f3339e380e71cd7b

sash-a avatar Apr 01 '21 17:04 sash-a

The cfrc_ext values are the contact forces. But Mujoco 2.0 has issues with those and just returns zeros (https://github.com/openai/gym/issues/1541)

GPaolo avatar May 18 '21 14:05 GPaolo