Are pretrained models MuJoCo compatible?

Open jendelel opened this issue 4 years ago • 6 comments

Hi,

I already had to model my environments in MuJoCo because the baseline algorithms I work with use it. As a proof of concept, I would like to take your pretrained Humanoid and let it run through my environment. I tried to model the observations based on your code, but haven't succeeded so far. I can send my code if you want.

Do you think such a port is even possible? It seems that the actions are too large.

Thank you, Lukas

jendelel avatar Sep 24 '19 16:09 jendelel

Hi! That was the original intent of the reimplementations! In terms of observation and action lengths, they should actually be compatible. Check out HumanoidMuJoCoEnv-v0; the other one is the roboschool version, which is oddly different (no idea why they did that). If the lengths are not equal, tell me. Another issue for now is that I don't know how the observations correspond between MuJoCo and PyBullet. MuJoCo has a cryptic environment state vector, and it is hard to find out what the OpenAI guys did there, so I'm having a hard time getting the right observations.

I am very interested in pretrained agents for all openai gym mujoco envs btw such that I can test the similarity of my envs to the openai gym reference implementation. If you want to help me with any of this, that would be great!

benelot avatar Sep 25 '19 16:09 benelot

As for the chance that the port works: I honestly have no idea. So far it has only worked with the pendulums. I am still stuck on the observations, so I can't really tell yet.

benelot avatar Sep 25 '19 16:09 benelot

Hi,

Thanks a lot for your reply. By debugging step by step, I think I got the observations to be almost the same as in PyBullet, and the predicted joint torques were very similar. However, after the action went through the engine, I got quite different observations.
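A step-by-step comparison like the one described here can be sketched as a small helper that reports which observation entries diverge between the two simulators. This is only an illustration: the function name `diff_observations`, the tolerance, and the example vectors are assumptions, not code from either project.

```python
from typing import List, Sequence, Tuple


def diff_observations(
    obs_a: Sequence[float],
    obs_b: Sequence[float],
    atol: float = 1e-3,
) -> List[Tuple[int, float, float]]:
    """Return (index, a, b) for every entry that differs by more than atol."""
    if len(obs_a) != len(obs_b):
        raise ValueError(
            f"observation lengths differ: {len(obs_a)} vs {len(obs_b)}"
        )
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(obs_a, obs_b))
        if abs(a - b) > atol
    ]


# Made-up vectors standing in for one observation from each simulator.
mujoco_obs = [1.25, 0.00, 0.10, -0.30]
bullet_obs = [1.25, 0.00, 0.45, -0.30]

mismatches = diff_observations(mujoco_obs, bullet_obs)
for index, a, b in mismatches:
    print(f"obs[{index}]: mujoco={a:+.3f} bullet={b:+.3f}")
```

Running this once on the reset observation and again after the first step would help separate a layout mismatch (differences already present at reset) from a dynamics mismatch (differences that appear only after the action is applied).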

I guess I missed something with the actuators or MuJoCo just handles these things differently. In the end, the Humanoid fell and didn't get up.

We managed to port even most of the reward for easy Flagrun. After 2 days of training (using SAC) the humanoid seems to move in the right direction with only occasional falling.

I'd love to help you with pretrained checkpoints for the MuJoCo humanoid, but unfortunately I don't have access to much compute power.

Good luck with your project. I really like what you're doing.

Lukas

jendelel avatar Sep 29 '19 10:09 jendelel

Hi again,

So I managed to port the HumanoidFlagrun to MuJoCo. I would like to try to train the same PPO agent you trained for yours. However, Tensorforce has evolved significantly and it looks like the API isn't the same. Could you tell me the command you used and the version of Tensorforce?

Thanks a lot.

Lukas

jendelel avatar Nov 11 '19 08:11 jendelel

Hi! Sorry for getting back to you so late. Unfortunately I do not remember which version I used, since it was only a preliminary test to see whether I could train them, and as others have mentioned, for some envs people do not manage to do it. I cannot say exactly why yet; there is still some work to do on it.

benelot avatar Jan 20 '20 10:01 benelot

Hi all,

I'm currently working on making some bullet environments that are compatible with the OpenAI Mujoco ones. I started making my own before discovering this project.

> I am very interested in pretrained agents for all openai gym mujoco envs btw such that I can test the similarity of my envs to the openai gym reference implementation. If you want to help me with any of this, that would be great!

I am currently conducting basically the same experiments, with both your environments and the ones I started. I have some policies that work well in MuJoCo, but I have not gotten them to transfer successfully yet. I've been focusing on walker2d first (for no particular reason). I see two big problems with the transfer.

  1. I think there are some mismatches in the state vectors. For example, the starting height is 0 in the bullet env vs 1.25 in the MuJoCo one, and I think the joint ordering is different between the two. This kind of stuff will be tedious but easy to fix.

  2. The physics are different between the two simulators. I've been using PyBullet's setPhysicsEngineParameter and changeDynamics to get as close as possible to MuJoCo. The two sims have fundamentally different constraint models, so I'm not sure how close we can get, or whether being policy compatible is possible.
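The state-vector fixes in point 1 amount to reordering entries and adding constant offsets. A minimal sketch, assuming a hypothetical four-entry observation and a made-up joint permutation (only the 1.25 height offset comes from the thread; the real walker2d layouts would have to be read from both environments):

```python
from typing import Dict, List, Sequence


def remap_observation(
    obs: Sequence[float],
    permutation: Sequence[int],
    offsets: Dict[int, float],
) -> List[float]:
    """Reorder `obs` and add per-entry offsets.

    permutation[i] is the index in `obs` that supplies entry i of the result.
    offsets maps a result index to a constant added after reordering.
    """
    if sorted(permutation) != list(range(len(obs))):
        raise ValueError("permutation must use every source index exactly once")
    remapped = [obs[src] for src in permutation]
    for index, offset in offsets.items():
        remapped[index] += offset
    return remapped


# Hypothetical layout: [height, joint_a, joint_b, joint_c].
bullet_obs = [0.0, 0.1, 0.2, 0.3]
# Assumed for illustration: the target order swaps the last two joints.
permutation = [0, 1, 3, 2]
# Height offset based on the starting heights mentioned above (0 vs 1.25).
offsets = {0: 1.25}

mujoco_like_obs = remap_observation(bullet_obs, permutation, offsets)
print(mujoco_like_obs)
```

Keeping the permutation and offsets as data rather than hard-coding them makes it easier to correct the mapping one entry at a time while comparing against the reference environment.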

I'm not sure if any of you are working on this anymore (this thread is rather old now), but if so I'm happy to share what I have.
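For the physics tuning in point 2, a rough sketch follows, assuming the calls meant are PyBullet's `setPhysicsEngineParameter` and `changeDynamics`. The helper name, the `RecordingBackend` stand-in, and every numeric value are placeholders for illustration, not settings known to match MuJoCo. The PyBullet module is passed in as an argument so the sketch can be dry-run without a simulator.

```python
from typing import Any, Dict, List, Tuple


def apply_physics_settings(
    p: Any,
    body_id: int,
    time_step: float = 0.002,
    solver_iterations: int = 50,
    lateral_friction: float = 1.0,
    joint_damping: float = 0.1,
) -> int:
    """Apply global solver settings and per-link dynamics; return links changed.

    `p` is the imported pybullet module (or a stand-in with the same calls).
    All default values are placeholders, not tuned to match MuJoCo.
    """
    p.setPhysicsEngineParameter(
        fixedTimeStep=time_step,
        numSolverIterations=solver_iterations,
    )
    # Link index -1 addresses the base; the other links are 0..getNumJoints-1.
    link_indices = range(-1, p.getNumJoints(body_id))
    for link_index in link_indices:
        p.changeDynamics(
            body_id,
            link_index,
            lateralFriction=lateral_friction,
            jointDamping=joint_damping,
        )
    return len(link_indices)


class RecordingBackend:
    """Hypothetical stand-in for pybullet that only records the calls made."""

    def __init__(self, num_joints: int) -> None:
        self.num_joints = num_joints
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def setPhysicsEngineParameter(self, **kwargs: Any) -> None:
        self.calls.append(("setPhysicsEngineParameter", kwargs))

    def getNumJoints(self, body_id: int) -> int:
        return self.num_joints

    def changeDynamics(self, body_id: int, link_index: int, **kwargs: Any) -> None:
        self.calls.append(("changeDynamics", {"link": link_index, **kwargs}))


backend = RecordingBackend(num_joints=6)
changed = apply_physics_settings(backend, body_id=0)
print(f"updated {changed} links with {len(backend.calls)} calls")
```

With PyBullet installed, passing the imported `pybullet` module (after connecting and loading a body) in place of the stand-in would apply the same calls to a real simulation. Whether any parameter choice brings the two constraint models close enough for policy transfer remains the open question raised above.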

sgillen avatar May 26 '20 18:05 sgillen