Sriram Krishna
> Can you share a little more about your setup, how you build the env (are you using transforms?), the PPO loss, and how you're using PPO and GAE +...
This seems to be a case of the right input not being passed. I tried training again with just `ee_pose` and `obj_goal` as the inputs to the policy. This performed...
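In case the shape of this helps: the idea is that the actor only ever sees those two keys. A minimal sketch of that kind of policy wrapper (the dimensions, the network, and the `ConcatPolicy` name are illustrative, not the actual code):

```python
import torch
import torch.nn as nn
from tensordict.nn import TensorDictModule

class ConcatPolicy(nn.Module):
    """Concatenates the selected observation keys before the MLP."""
    def __init__(self, ee_dim: int, goal_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ee_dim + goal_dim, 64),
            nn.Tanh(),
            nn.Linear(64, action_dim),
        )

    def forward(self, ee_pose: torch.Tensor, obj_goal: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([ee_pose, obj_goal], dim=-1))

# TensorDictModule passes each in_key to forward() as a separate argument,
# so only these two keys ever reach the policy.
policy = TensorDictModule(
    ConcatPolicy(ee_dim=7, goal_dim=3, action_dim=4),  # illustrative dimensions
    in_keys=["ee_pose", "obj_goal"],
    out_keys=["action"],
)
```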
- The `reset_obs` values match. Before seeding, the gym and torchrl observations do not match; after seeding both envs, they match as well (a sketch of the comparison is below).
- For the policy itself -...
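For anyone wanting to run the same comparison, a minimal version looks something like this (sketch only; `MyRobotEnv-v0` is a placeholder env id, and it assumes a dict observation space):

```python
import gymnasium as gym
import numpy as np
from torchrl.envs import GymEnv

ENV_ID = "MyRobotEnv-v0"  # placeholder: substitute the actual env id
SEED = 0

# gym side: seed through reset()
gym_env = gym.make(ENV_ID)
gym_obs, _ = gym_env.reset(seed=SEED)

# torchrl side: seed the wrapped env, then reset
torchrl_env = GymEnv(ENV_ID)
torchrl_env.set_seed(SEED)
td = torchrl_env.reset()

# with the same seed, each observation key should line up
for key, value in gym_obs.items():  # assumes the gym obs is a dict
    print(key, np.allclose(value, td[key].cpu().numpy()))
```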
I'm facing the same issue with 1.7.3 as well; the context object in kwargs is None.
I was also able to get it working without problems. I was most likely facing some environment issues. I reinstalled everything in a fresh env and it worked like a charm....
Using the qvel directly without normalizing gives results close to what's expected. Even then, it doesn't exactly follow the same trajectory as the predicted actions, but it comes quite close. Related:...
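To be concrete about the normalization I mean, something like this (a sketch; `max_qvel` and the `normalize` flag are illustrative names, not the PR's actual code):

```python
import numpy as np

def actions_from_qvel(qvel: np.ndarray, max_qvel: np.ndarray,
                      normalize: bool = False) -> np.ndarray:
    """Turn recorded joint velocities into policy-style actions.

    With normalize=False the raw qvel is used as the action, which is what
    tracks the reference trajectory reasonably well. With normalize=True the
    velocities are scaled into [-1, 1] by the joint velocity limits, which is
    where the behaviour diverges for me.
    """
    if normalize:
        return np.clip(qvel / max_qvel, -1.0, 1.0)
    return qvel
```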
Bump on this PR. Even though the teleop works, it would be good to know if the implementation is correct, especially since it doesn't work when I normalize the velocities.
Ready for review!
While running more experiments, I noticed that the training only sometimes went to NaN; on retrying a few times, the policy eventually converged. In this way I was able to...
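In case it's useful to anyone hitting the same NaNs, a simple guard around the update step would look roughly like this (a sketch assuming a standard PyTorch loop; not what the runs above used):

```python
import torch

def safe_step(loss: torch.Tensor, optimizer: torch.optim.Optimizer,
              model: torch.nn.Module, max_norm: float = 1.0) -> bool:
    """Skip the update if the loss is non-finite; otherwise clip grads and step."""
    optimizer.zero_grad(set_to_none=True)
    if not torch.isfinite(loss):
        return False  # drop this batch instead of poisoning the weights
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return True
```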