mujoco icon indicating copy to clipboard operation
mujoco copied to clipboard

Different acceleration of mocap welded body joints between MuJoCo 2.3.7 and MuJoCo 3

Open rainx0r opened this issue 3 months ago • 10 comments

Hi,

I'm a PhD Student also working on maintaining the Metaworld multi-task / meta RL benchmark, which is built by simulating a Sawyer arm and various tasks in MuJoCo. Currently I'm trying to migrate the environment to MuJoCo 3, hoping to leverage MJX, which would be very beneficial in this setting.

Following the issue instructions, I'm attaching a minimal model that illustrates the problem as a .zip file (model.zip). I also have a minimal issue reproduction repo here.

In this MuJoCo model, in order to pass control signals from an RL agent or a scripted policy to the robot arm we use a mocap body which is welded to the end effector. Then we can set mjData.mocap_pos and mjData.mocap_quat (although the latter is always static in this specific case) to the desired position, simulate, and hopefully the robot arm moves accordingly. Here is a script using the Python bindings demonstrating what I mean (that can also be used to demonstrate the discrepancy across versions):

import mujoco  # type: ignore
import numpy as np

model = mujoco.MjModel.from_xml_path("./model.xml")
data = mujoco.MjData(model)
mujoco.mj_resetData(model, data)

HAND_INIT_POS = np.array((0, 0.6, 0.2))
HAND_INIT_QUAT = np.array([1, 0, 1, 0])
CTRL = np.array([-1, 1])
MOCAP_ID = model.body_mocapid[data.body("mocap").id]

data.mocap_pos[MOCAP_ID][:] = HAND_INIT_POS
data.mocap_quat[MOCAP_ID][:] = HAND_INIT_QUAT

mujoco.mj_step(model, data)

print("After 1 step:")
print("- qacc:", data.qacc[:])
print("- qvel:", data.qvel[:])
print("- qpos:", data.qpos[:])

With the exact same model and code, in MuJoCo 2.3.7 we get the following output:

After 1 step:
- qacc: [   804.10320363  -1246.53803675   2810.93304404   1363.60715965
 -10193.32350425   6161.43186826   7367.36584843     -0.39162863
     -0.39162863]
- qvel: [ 1.96310289 -1.24425176 -0.13583512  5.72689887 -6.44515445  3.77934942
  0.93719828 -0.00059986 -0.00059986]
- qpos: [ 0.00490776 -0.00311063 -0.00033959  0.01431725 -0.01611289  0.00944837
  0.002343   -0.0000015  -0.0000015 ]

And in MuJoCo 3.X (tested with MuJoCo 3.0.0 and 3.1.3, output was identical):

After 1 step:
- qacc: [  878.14637277 -1248.67353712  3086.88918714   854.47674222
 -4138.06110609 12384.86175066   818.44810668    -0.45816371
    -0.45816371]
- qvel: [ 2.17286831 -1.76721378  1.86603915  4.18106732 -2.08558898  7.15220456
  0.24056445 -0.0004012  -0.0004012 ]
- qpos: [ 0.00543217 -0.00441803  0.0046651   0.01045267 -0.00521397  0.01788051
  0.00060141 -0.000001   -0.000001  ]

Specifically, I notice the qacc is drastically different for joints 4, 5 and 6 (0-indexed), which belong to the bodies right_l4, right_l5 and right_l6 in the model, accordingly. After doing my best to patiently compare mjModel and MjData state and how it evolves across the two versions, this is the biggest difference I could find that could explain the problem we're seeing practically.

In practice, to initialise the arm position in the actual project we essentially do the same thing as in the script above, but with nstep=5 in the mj_step and do 50 such steps:

for _ in range(50):
    data.mocap_pos[MOCAP_ID][:] = HAND_INIT_POS
    data.mocap_quat[MOCAP_ID][:] = HAND_INIT_QUAT
    mujoco.mj_step(model, data, nstep=5)

In the image below demonstrating initial hand position, on the left is MuJoCo 2.3.7 and on the right is MuJoCo 3.0.0: Screenshot_2024-03-01_at_20

And videos visualising the difference:

MuJoCo 2.3.7 MuJoCo 2.3.7

MuJoCo 3.0.0 MuJoCo 3.0.0

Note that I tested with 3.0.0 mostly because I was trying to rule out the possibility that this was introduced in more recent versions. I have tested if any of the more recent versions fix this (3.1.2 and 3.1.3), but their behaviour is completely identical to 3.0.0. I have been using the Python bindings, tested on Python 3.9 and 3.11. Additionally, I have mostly been using MuJoCo 3 on my M1 Pro MacBook running macOS Sonoma 14.3.1, and MuJoCo 2.3.7 on a Debian 12 VM. I have also tested MuJoCo 3 on said Debian VM and also my x86 workstation running Arch Linux, but the behaviour is identical across all of them and it doesn't make a difference. So this doesn't seem to be a platform-specific issue.

I have tried looking for similar issues to no avail, and I have also looked through the change-log multiple times but I don't see any of the "breaking changes" or introduction of new features affecting this model. That said, admittedly I'm quite new to MuJoCo so maybe I'm missing something obvious.

Thank you for your time in advance, any pointers would be much appreciated.

rainx0r avatar Mar 07 '24 00:03 rainx0r

Hi, I tried loading your model but I only see the robot arm?

Anyway, do you have contacts in your example? If so, can you repeat the test without the margin parameter? Its behavior changed in 3.0 (see bugfix 33 in the changelog) so it's a possible candidate for your issue.

Also, I can't see your videos.

quagla avatar Mar 13 '24 12:03 quagla

Hi, thank you for looking into my issue!

Indeed, in the minimal model I've provided there is just the robot arm because the rest of the objects (like the table / cube etc) that are present in the real environment aren't actually needed to demonstrate the issue, so I removed them in accordance with the issue submission guidelines. However it was faster for me to take screenshots and record video in the real environment hence that discrepancy. But the issue is with the hand control itself, which is demonstrated by the minimal model and script.

Regarding the videos, my apologies. I am using Safari where they load fine but you are right that on Chrome they seem to not load. Here is a second attempt after re-encoding to .webm:

Thank you for pointing my attention to the geom margin changes, I missed that when reading the changelog. However, it doesn't seem to make a difference. After removing any margin definitions from the minimal model, the results are still the same.

rainx0r avatar Mar 13 '24 13:03 rainx0r

I now suspect this might be the result of bug fix 12 in 3.1.0. Could you please compare 3.0.1 and 3.1.0 to help us narrow it down?

Can you please make a minimal model (ideally with no meshes, you can ask MuJoCo to approximate meshes as primitives and save the XML), and also without any required Python code? You can add keyframes to you model which set the state. Such that you can say: load this model in these two versions, load keyframe 0 and see large differences in the trajectory.

Another reason to approximate the meshes as primitives, besides making our life a bit easier, is that is would disambiguate my hypothesis above that the change is due to changed inertias computed from meshes. If the hypothesis is true then after conversion to primitives the difference will disappear.

yuvaltassa avatar Mar 14 '24 14:03 yuvaltassa

I now suspect this might be the result of bug fix 12 in 3.1.0. Could you please compare 3.0.1 and 3.1.0 to help us narrow it down?

As per your suggestion I did just re-run the reproduction script on both 3.0.1 and 3.1.0 (versions I hadn't tested before) and their output in terms of qacc, qvel and qpos after a simulation step is identical to that of 3.0.0 and 3.1.2 (and different to that of 2.3.7). As in it is exactly the same output string as posted in my original post for 3.0.0.

This was no surprise, as I initially discovered the issue trying to upgrade from 2.3.7 -> 3.1.2. The reason I went down from 3.1.2 to 3.0.0 when testing was precisely to help with narrowing down which version exactly causes the difference in behaviour. Which seems to indeed be 3.0.0.

Can you please make a minimal model (ideally with no meshes, you can ask MuJoCo to approximate meshes as primitives and save the XML), and also without any required Python code? You can add keyframes to you model which set the state. Such that you can say: load this model in these two versions, load keyframe 0 and see large differences in the trajectory.

I am keen to attempt this but I am not sure how to ask MuJoCo to approximate the meshes? I looked through MuJoCo's API but I can't find any options for loadXML or similar functions that would do that, and neither can I see anything in the Simulate utility.

rainx0r avatar Mar 14 '24 15:03 rainx0r

Read this: https://mujoco.readthedocs.io/en/latest/XMLreference.html#compiler-fitaabb https://mujoco.readthedocs.io/en/latest/XMLreference.html#body-geom-mesh

yuvaltassa avatar Mar 17 '24 12:03 yuvaltassa

And/or read/load these: https://github.com/google-deepmind/mujoco/blob/main/test/user/testdata/fitmesh_aabb.xml https://github.com/google-deepmind/mujoco/blob/main/test/user/testdata/fitmesh_inertiabox.xml

yuvaltassa avatar Mar 17 '24 12:03 yuvaltassa

Corresponding image from the changelog: https://mujoco.readthedocs.io/en/latest/_images/meshfit.png

yuvaltassa avatar Mar 17 '24 13:03 yuvaltassa

Thank you for linking those resources, very informative! I didn't think to look into the XML for options regarding this.

Taking a closer look at the model trying to implement this "mesh fitting" and remove meshes, I realised that actually in this instance meshes are purely cosmetic for video rendering: they have no collision and really the relevant physics are done by hidden primitives. So, I simply removed all the meshes (and also other assets like textures), and the issue still seems to be the same.

This is the new "minimal" model
<mujoco>
    <option timestep='0.0025' iterations="50" tolerance="1e-10" solver="Newton" jacobian="dense" cone="elliptic"/>
    <compiler angle="radian" inertiafromgeom="auto" inertiagrouprange="4 5"/>

    <keyframe>
        <key name="issue_0" mpos="0 0.6 0.2" mquat="1 0 1 0"/>
        <key name="issue_1" mpos="0 0.6 0.2" mquat="1 0 1 0" time="0.125"/>
    </keyframe>

    <asset>
        <texture type="skybox" builtin="gradient" rgb1=".50 .495 .48" rgb2=".50 .495 .48" width="32" height="32"/>
        <material name="xyz_col" rgba="0.3 0.3 1.0 0.5" shininess="0" specular="0.5"/>
    </asset>

    <default>
      <default class="xyz_base">
          <joint armature="0.001" damping="2" limited="true"/>
          <position ctrllimited="true" ctrlrange="0 1.57"/>
          <default class="base_col">
              <geom conaffinity="1" condim="4" contype="1" group="4"  material="xyz_col" solimp=".8 .9 .01" solref=".02 1"/>
          </default>
      </default>
    </default>

    <worldbody>
        <light castshadow="false" directional='true' diffuse='.3 .3 .3' specular='0.3 0.3 0.3' pos='-1 -1 1'
               dir='1 1 -1'/>
        <light directional='true' diffuse='.3 .3 .3' specular='0.3 0.3 0.3' pos='1 -1 1' dir='-1 1 -1'/>
        <light castshadow="false" directional='true' diffuse='.3 .3 .3' specular='0.3 0.3 0.3' pos='0 1 1'
               dir='0 -1 -1'/>

        <geom name="floor" size="4 4 .1" pos="0 0 -.913" conaffinity="1" contype="1" type="plane"
              condim="3"/>

        <!-- Arm -->
        <body name="base" childclass="xyz_base" pos="0 0 0">
            <site name="basesite" pos="0 0 0" size="0.01" />
            <inertial pos="0 0 0" mass="0" diaginertia="0 0 0" />
            <body name="controller_box" pos="0 0 0">
                <inertial pos="-0.325 0 -0.38" mass="46.64" diaginertia="1.71363 1.27988 0.809981" />
                <geom size="0.11 0.2 0.265" pos="-0.325 0 -0.38" type="box" rgba="0.2 0.2 0.2 1"/>
            </body>
            <body name="pedestal_feet" pos="0 0 0">
                <inertial pos="-0.1225 0 -0.758" mass="167.09" diaginertia="8.16095 9.59375 15.0785" />
                <geom size="0.385 0.35 0.155" pos="-0.1225 0 -0.758" type="box" rgba="0.2 0.2 0.2 1"
                    contype="0"
                    conaffinity="0"
                />
            </body>
            <body name="torso" pos="0 0 0">
                <inertial pos="0 0 0" mass="0.0001" diaginertia="1e-08 1e-08 1e-08" />
                <geom size="0.05 0.05 0.05" type="box" contype="0" conaffinity="0" group="1" rgba="0.2 0.2 0.2 1" />
            </body>
            <body name="pedestal" pos="0 0 0">
                <inertial pos="0 0 0" quat="0.659267 -0.259505 -0.260945 0.655692" mass="60.864" diaginertia="6.0869 5.81635 4.20915" />
                <geom size="0.18 0.31" pos="-0.02 0 -0.29" type="cylinder" rgba="0.2 0.2 0.2 1" />
            </body>
            <body name="right_arm_base_link" pos="0 0 0">
                <inertial pos="-0.0006241 -2.8025e-05 0.065404" quat="-0.209285 0.674441 0.227335 0.670558" mass="2.0687" diaginertia="0.00740351 0.00681776 0.00672942" />
                <geom size="0.08 0.12" pos="0 0 0.12" type="cylinder" rgba="0.5 0.1 0.1 1" />
                <body name="right_l0" pos="0 0 0.08">
                    <inertial pos="0.024366 0.010969 0.14363" quat="0.894823 0.00899958 -0.170275 0.412573" mass="5.3213" diaginertia="0.0651588 0.0510944 0.0186218" />
                    <joint name="right_j0" pos="0 0 0" axis="0 0 1" limited="true" range="-3.0503 3.0503" damping="10"/>
                    <body name="head" pos="0 0 0.2965">
                        <inertial pos="0.0053207 -2.6549e-05 0.1021" quat="0.999993 7.08405e-05 -0.00359857 -0.000626247" mass="1.5795" diaginertia="0.0118334 0.00827089 0.00496574" />
                        <body name="screen" pos="0.03 0 0.105" quat="0.5 0.5 0.5 0.5">
                            <inertial pos="0 0 0" mass="0.0001" diaginertia="1e-08 1e-08 1e-08" />
                            <geom size="0.12 0.07 0.001" type="box" contype="0" conaffinity="0" group="1" rgba="0.2 0.2 0.2 1" />
                        </body>
                    </body>
                    <body name="right_torso_itb" pos="-0.055 0 0.22" quat="0.707107 0 -0.707107 0">
                        <inertial pos="0 0 0" mass="0.0001" diaginertia="1e-08 1e-08 1e-08" />
                    </body>
                    <body name="right_l1" pos="0.081 0.05 0.237" quat="0.5 -0.5 0.5 0.5">
                        <inertial pos="-0.0030849 -0.026811 0.092521" quat="0.424888 0.891987 0.132364 -0.0794296" mass="4.505" diaginertia="0.0224339 0.0221624 0.0097097" />
                        <joint name="right_j1" pos="0 0 0" axis="0 0 1"
                                limited="true" range="-3.8 -0.5"
                                damping="10"/>
                        <body name="right_l2" pos="0 -0.14 0.1425" quat="0.707107 0.707107 0 0">
                            <inertial pos="-0.00016044 -0.014967 0.13582" quat="0.707831 -0.0524761 0.0516007 0.702537" mass="1.745" diaginertia="0.0257928 0.025506 0.00292515" />
                            <joint name="right_j2" pos="0 0 0" axis="0 0 1" limited="true" range="-3.0426 3.0426" damping="10"/>
                            <geom size="0.06 0.17" pos="0 0 0.08" type="cylinder" rgba="0.5 0.1 0.1 1" />
                            <body name="right_l3" pos="0 -0.042 0.26" quat="0.707107 -0.707107 0 0">
                                <site name="armsite" pos="0 0 0" size="0.01" />
                                <inertial pos="-0.0048135 -0.0281 -0.084154" quat="0.902999 0.385391 -0.0880901 0.168247" mass="2.5097" diaginertia="0.0102404 0.0096997 0.00369622" />
                                <joint name="right_j3" pos="0 0 0" axis="0 0 1" limited="true" range="-3.0439 3.0439" damping="10"/>
                                <body name="right_l4" pos="0 -0.125 -0.1265" quat="0.707107 0.707107 0 0">
                                    <inertial pos="-0.0018844 0.0069001 0.1341" quat="0.803612 0.031257 -0.0298334 0.593582" mass="1.1136" diaginertia="0.0136549 0.0135493 0.00127353" />
                                    <joint name="right_j4" pos="0 0 0" axis="0 0 1" limited="true" range="-2.9761 2.9761" damping="10" />
                                    <geom size="0.045 0.15" pos="0 0 0.11" type="cylinder" rgba="0.5 0.1 0.1 1" />
                                    <body name="right_arm_itb" pos="-0.055 0 0.075" quat="0.707107 0 -0.707107 0">
                                        <inertial pos="0 0 0" mass="0.0001" diaginertia="1e-08 1e-08 1e-08" />
                                    </body>
                                    <body name="right_l5" pos="0 0.031 0.275" quat="0.707107 -0.707107 0 0">
                                        <inertial pos="0.0061133 -0.023697 0.076416" quat="0.404076 0.9135 0.0473125 0.00158335" mass="1.5625" diaginertia="0.00474131 0.00422857 0.00190672" />
                                        <joint name="right_j5" pos="0 0 0" axis="0 0 1" limited="true" range="-2.9761 2.9761" damping="10"/>
                                        <body name="right_wrist" pos="0 0 0.10541" quat="0.707107 0.707107 0 0">
                                            <inertial pos="0 0 0.10541" quat="0.707107 0.707107 0 0" mass="0" diaginertia="0 0 0" />
                                        </body>
                                        <body name="right_l6" pos="0 -0.11 0.1053" quat="0.0616248 0.06163 -0.704416 0.704416">
                                            <inertial pos="-8.0726e-06 0.0085838 -0.0049566" quat="0.479044 0.515636 -0.513069 0.491322" mass="0.3292" diaginertia="0.000360258 0.000311068 0.000214974" />
                                            <joint name="right_j6" pos="0 0 0" axis="0 0 1" limited="true" range="-4.7124 4.7124" damping="10"/>
                                            <geom size="0.055 0.025" pos="0 0.015 -0.01" type="cylinder" rgba="0.5 0.1 0.1 1" />
                                            <body name="right_hand" pos="0 0 0.0245" quat="0.707107 0 0 0.707107">
                                                <inertial pos="1e-08 1e-08 1e-08" quat="0.820473 0.339851 -0.17592 0.424708" mass="1e-08" diaginertia="1e-08 1e-08 1e-08" />

                                                <geom size="0.035 0.014" pos="0 0 0.015" type="cylinder" rgba="0 0 0 1"/>
                                                <body name="hand" pos="0 0 0.12" quat="-1 0 1 0">
                                                    <site name="endEffector" pos="0.04 0 0" size="0.01" rgba='1 1 1 1' />
                                                    <geom name="rail" type="box" pos="-0.05 0 0" density="7850" size="0.005 0.055 0.005"  rgba="0.5 0.5 0.5 1.0" condim="3" friction="2 0.1 0.002"   />

                                                    <body name="rightclaw" pos="0 -0.05 0" >

                                                        <geom class="base_col" name="rightclaw_it" condim="4"  type="box" user="0" pos="0 0 0" size="0.045 0.003 0.015"  rgba="1 1 1 1.0"   />

                                                        <joint name="r_close" pos="0 0 0" axis="0 1 0" range= "0 0.04" armature="100" damping="1000" limited="true"  type="slide"/>
                                                        <site name="rightEndEffector" pos="0.045 0 0" size="0.01" rgba="1.0 0.0 0.0 1.0"/>
                                                        <body name="rightpad" pos ="0 .003 0" >
                                                            <geom name="rightpad_geom" condim="4"  type="box" user="0" pos="0 0 0" size="0.045 0.003 0.015" rgba="1 1 1 1.0" solimp="0.95 0.99 0.01" solref="0.01 1" friction="2 0.1 0.002" contype="1" conaffinity="1" mass="1"/>
                                                        </body>

                                                    </body>

                                                    <body name="leftclaw" pos="0 0.05 0">
                                                        <geom class="base_col" name="leftclaw_it" condim="4"  type="box" user="0" pos="0 0 0" size="0.045 0.003 0.015"  rgba="0 1 1 1.0"  />
                                                        <joint name="l_close" pos="0 0 0" axis="0 1 0" range= "-0.03 0" armature="100" damping="1000" limited="true"  type="slide"/>
                                                        <site name="leftEndEffector" pos="0.045 0 0" size="0.01" rgba="1.0 0.0 0.0 1.0"/>
                                                        <body name="leftpad" pos ="0 -.003 0" >
                                                            <geom name="leftpad_geom" condim="4"  type="box" user="0" pos="0 0 0" size="0.045 0.003 0.015" rgba="0 1 1 1.0" solimp="0.95 0.99 0.01" solref="0.01 1" friction="2 0.1 0.002"  contype="1" conaffinity="1" />
                                                        </body>

                                                    </body>
                                                </body>
                                            </body>
                                        </body>
                                    </body>
                                    <body name="right_l4_2" pos="0 0 0">
                                        <inertial pos="1e-08 1e-08 1e-08" quat="0.820473 0.339851 -0.17592 0.424708" mass="1e-08" diaginertia="1e-08 1e-08 1e-08" />
                                    </body>
                                </body>
                            </body>
                            <body name="right_l2_2" pos="0 0 0">
                                <inertial pos="1e-08 1e-08 1e-08" quat="0.820473 0.339851 -0.17592 0.424708" mass="1e-08" diaginertia="1e-08 1e-08 1e-08" />
                            </body>
                        </body>
                        <body name="right_l1_2" pos="0 0 0">
                            <inertial pos="1e-08 1e-08 1e-08" quat="0.820473 0.339851 -0.17592 0.424708" mass="1e-08" diaginertia="1e-08 1e-08 1e-08" />
                            <geom size="0.07 0.07" pos="0 0 0.035" type="cylinder" rgba="0.2 0.2 0.2 1"/>
                        </body>
                    </body>
                </body>
            </body>
        </body>

        <body mocap="true" name="mocap" pos="0 0 0">
            <geom conaffinity="0" contype="0" pos="0 0 0" rgba="0.0 0.5 0.5 1" size="0.01" type="sphere"></geom>
            <site name="mocap" pos="0 0 0" rgba="0.0 0.5 0.5 1" size="0.01" type="sphere"></site>
        </body>
    </worldbody>

    <!-- Control -->
    <equality>
        <weld body1="mocap" body2="hand" solref="0.02 1"></weld>
    </equality>
</mujoco>

I also added a couple keyframes trying to emulate what I'm doing in the Python script: setting the mocap pos and quat at time 0 as a first keyframe, and keeping it there for 50 timesteps worth as a second keyframe. I could not really get this to visualise properly in the simulate utility, though, but maybe I'm missing something about visualising keyframes in that utility or something about keyframes in general.

Either way, even completely without any meshes, all the qacc, qvel and qpos values are still the same as before (and still different between 2.3.7 and 3.X)

rainx0r avatar Mar 19 '24 00:03 rainx0r

@yuvaltassa @quagla just wondering what the next steps are for this. @evangelos-ch and I work together on Meta-World and we are hoping to make a release in the near future. Thanks in advance.

reginald-mclean avatar Apr 03 '24 17:04 reginald-mclean