pybullet-gym
AntMuJoCoEnv-v0 contains many unnecessary states.
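import gym
import pybulletgym  # registers the PyBullet environments with gym

env = gym.make("AntMuJoCoEnv-v0")
env.reset()
for i in range(10):
    state, _, _, _ = env.step(env.action_space.sample())
print(state)  # state after the last step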
array([ 0.48600267, -0.02747473, -0.0488695 , 0.55320414, 0.76180714,
-0.46062074, 0.7321935 , 0.43551934, -0.59654198, 0.01406484,
-0.9421499 , 0.35980088, 0.96942428, 0.34243858, 0.07390555,
-0.01478774, -0.01189533, 0.00519861, 0.10619358, -0.02251298,
0.04225551, 0.03821672, -0.03983905, 0.0030749 , -0.06877575,
0.03366648, 0.06564061, 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. ])
I think we can remove many states after a certain index. What do you think?
I was just about to ask about the root cause of this issue. This is the line:
https://github.com/benelot/pybullet-gym/blob/ec9e87459dd76d92fe3e59ee4417e5a665504f62/pybulletgym/envs/mujoco/robots/locomotors/ant.py#L21
I'm reasonably familiar with pybullet, but not at all with mujoco, so if someone could tell me what cfrc_ext is actually referring to, I wouldn't mind having a dig around to see if I could find it (no promises that I will find it though).
Ok, a quick update: cfrc_ext refers to the contact points. I would imagine that it is in there for ground contact points, but the (14, 6) shape doesn't really make sense then, as you would expect an (x, 4) or (4, x) shape for the 4 feet. According to this thread, those values are often, but not always, 0 in mujoco as well. So @seolhokim, I wouldn't worry too much about it. It should definitely be left in so that the shape matches the mujoco envs. If you dig through the code you will find quite a lot of areas where the state values are simply set to 0 to maintain the same shape as the mujoco envs.
According to this thread it does seem possible to get the forces, at least in C. Hopefully these methods are callable in python too. @benelot are you accepting pull requests? If so, I will try to find something that will work. Also, any idea why the shape is (14, 6)? My best guess is that the ant has 14 body parts and the contact force vector is of length 6, but I'm really not sure.
When building all the envs, I first wanted them to comply with the observation sizes of the original mujoco envs. This is why I padded the observation values I know with 0s to reach the same length. I then intended to identify all the missing observation values in mujoco and find the corresponding ones in pybullet, but I could not find any description of them at all!
Check it out here: https://github.com/openai/gym/blob/c8a659369d98706b3c98b84b80a34a832bbdc6c0/gym/envs/mujoco/ant.py#L35
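The zero-padding described here can be sketched roughly as follows. This is only an illustration, not the code used in pybullet-gym: `pad_observation` and `MUJOCO_ANT_OBS_DIM` are hypothetical names, and the 27/84 split is taken from the state printed at the top of this thread (27 non-zero values followed by 84 zeros).

```python
# Hypothetical constant: observation length of the MuJoCo Ant env,
# matching the 111 values printed at the top of this thread.
MUJOCO_ANT_OBS_DIM = 111


def pad_observation(known_values, target_dim=MUJOCO_ANT_OBS_DIM):
    """Append zeros so the observation has exactly target_dim entries.

    Hypothetical helper for illustration; not part of pybullet-gym.
    """
    known = [float(v) for v in known_values]
    missing = target_dim - len(known)
    if missing < 0:
        raise ValueError("more known values than the target dimension")
    # The unknown entries (here: the contact-force block) are filled with 0.
    return known + [0.0] * missing


# 27 known values (positions and velocities) padded up to 111 entries.
obs = pad_observation([0.5] * 27)
print(len(obs))
```

Padding keeps the observation shape compatible with the mujoco envs, at the cost of carrying 84 entries that hold no information.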
If anybody finds out what cfrc_ext is in mujoco and what it corresponds to in pybullet, I am happy to help you work all this out.
I absolutely accept pull requests.
Contact points? Maybe then it is similar to the ant in roboschool that I ported here as well. There is also some contact point related stuff there if I remember well.
Btw, these missing observations are all over the place in the mujoco code. If you are eager, talented and interested in figuring out what they are, I am really happy to help if I can, as I was somehow not able to figure it out at the time.
Sure I'll definitely have a dig around for these contact points, can't commit to all the other missing values, but if I have some extra time I will look for them.
I do see foot contact points here
https://github.com/benelot/pybullet-gym/blob/ec9e87459dd76d92fe3e59ee4417e5a665504f62/pybulletgym/envs/mujoco/envs/locomotion/walker_base_env.py#L70-L78
But as this is only for 4 feet I don't think it would give us the correct shape. I think I would need to have a proper dig around the mujoco code to see why the array is the shape that it is, which I will do tomorrow or the next day :)
@benelot
111-dim observation space
z (height) of the Torso -> 1
orientation (quaternion x,y,z,w) of the Torso -> 4
8 Joint angles -> 8
3-dim directional velocity and 3-dim angular velocity -> 3+3=6
8 Joint velocity -> 8
External forces (force x,y,z + torque x,y,z) applied to the CoM of each link (Ant has 14 links: ground + torso + 12 leg links (3 links for each of the 4 legs)) -> (3+3)*14=84
I found this in https://enginius.tistory.com/734
Is it helpful to fill in the exact state?
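To make the breakdown above easier to check, here is a small sketch that splits a 111-dim observation into the listed blocks. The block names and the `split_observation` helper are made up for illustration. The sizes and their order follow the list above, not any verified pybullet-gym code.

```python
# (name, size) pairs in the order given in the breakdown above.
# The names are hypothetical labels chosen for this sketch.
ANT_OBS_LAYOUT = [
    ("torso_z", 1),
    ("torso_quaternion", 4),
    ("joint_angles", 8),
    ("torso_velocity", 6),      # 3 linear + 3 angular
    ("joint_velocities", 8),
    ("external_forces", 84),    # 14 links * (3 force + 3 torque)
]


def split_observation(obs):
    """Split a flat observation into named blocks according to ANT_OBS_LAYOUT."""
    expected = sum(size for _, size in ANT_OBS_LAYOUT)
    if len(obs) != expected:
        raise ValueError(f"expected {expected} values, got {len(obs)}")
    parts = {}
    start = 0
    for name, size in ANT_OBS_LAYOUT:
        parts[name] = obs[start:start + size]
        start += size
    return parts


# Dummy observation 0..110, so each block shows its own index range.
parts = split_observation(list(range(111)))
print({name: len(block) for name, block in parts.items()})
```

With this layout the external-force block starts at index 27, which is where the zeros begin in the state printed at the top of the thread.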
@seolhokim thanks, this is super helpful! I should have some time later today to see if I can work out how to get those external forces in pybullet :)
@sash-a Oh I forgot to tag you. I believe you can do that! Thank you! :)
Sounds great. In a future refactoring, these foot contact calculations could then go into a "mujoco layer" that is able to generate appropriate observations for all types of robots in environments such that replacing mujoco with open source is more achievable in the future.
@seolhokim The initial intent of this project was to replace the Mujoco-based openai gym envs with open source software. I therefore try to reproduce everything this entails, including the exact or approximate observation state. We will not get to the level where the same actions produce exactly the same responses in both mujoco and pybullet, but I hope we get close enough that mujoco-trained agents can run in pybullet and achieve similar performance.
I'm pretty sure I've found what we're looking for although it doesn't look like it's going to work. According to the docs we would need a torque sensor on the joints, which I think means we would have to modify the xml assets and I'm not too sure if that's a good idea.
The method is getJointState and one of its outputs is jointReactionForces, which according to the docs is "list of 6 floats | These are the joint reaction forces, if a torque sensor is enabled for this joint. Without torque sensor, it is [0,0,0,0,0,0]". I'm pretty sure this is what we are looking for. However, I never found this to produce a value other than 0, which means that the joints don't have torque sensors. I think, given this thread, we can leave them as is, since mujoco seems to have a similar problem.
If you want to have a look for yourself I put up a quick way to use this method in this commit ca7ab786af02286838325a71f3339e380e71cd7b
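As a rough sketch of how this could be tried without touching the xml assets: pybullet also exposes enableJointForceTorqueSensor, which switches the sensor on at runtime. The two helper functions below are hypothetical and not part of pybullet-gym. `client` is assumed to be the pybullet module (or a BulletClient), and `body_id` and `joint_indices` are placeholders for the loaded ant and its joints. Note that joint reaction forces are not necessarily the same quantity as the per-link external forces in mujoco's cfrc_ext, so this is a starting point for experimenting rather than a drop-in replacement.

```python
def enable_force_torque_sensors(client, body_id, joint_indices):
    """Turn on the force/torque sensor for each given joint.

    `client` is assumed to be the pybullet module or a BulletClient.
    """
    for joint_index in joint_indices:
        # Third argument is enableSensor (1 = on).
        client.enableJointForceTorqueSensor(body_id, joint_index, 1)


def read_joint_reaction_forces(client, body_id, joint_indices):
    """Return one 6-element list per joint.

    getJointState returns (position, velocity, reactionForces, appliedTorque);
    index 2 holds the 6 joint reaction force/torque values.
    """
    return [
        list(client.getJointState(body_id, joint_index)[2])
        for joint_index in joint_indices
    ]
```

The intended use would be to call `enable_force_torque_sensors` once after the robot is loaded, step the simulation, and then call `read_joint_reaction_forces` to see whether the values are still all 0.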
The cfrc_ext values are the contact forces. But Mujoco 2.0 has issues with those and just returns zeros (https://github.com/openai/gym/issues/1541)