[Feature Request] Add ManiSkill3 Custom Env
Motivation
ManiSkill3 has been gaining a lot of traction recently. It offers great features such as parallel GPU-vectorized environments and a large set of tasks ranging from simple to complex, and it is quite performant.
You can find more about it here
Currently it is not possible to use ManiSkill3 with the Gym wrappers since ManiSkill3 returns torch tensors directly (very nice), but the TorchRL gym wrappers expect numpy arrays.
There is also some custom logic needed to make ManiSkill3 work, so I do not think a simple Gym wrapper would be enough; a custom env is probably more appropriate in this case.
Solution
I would like to have a ManiSkill3 env available as a custom TorchRL env.
Alternatives
I already have a custom env that I currently use, but it needs to be cleaned up to fit everybody's needs and to be clean enough for the TorchRL repo.
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
CUDA support is currently not possible if we have any metrics-logging code (which would require `.to("cpu")`), see: https://github.com/pytorch/rl/issues/2644#issuecomment-2625706891
Thanks for this, Alexandre
Currently it is not possible to use ManiSkill3 with the Gym wrappers since ManiSkill3 returns torch tensors directly (very nice), but the TorchRL gym wrappers expect numpy arrays.
Funny though, torchrl doesn't support gym envs that return tensors :p
How are the ManiSkill spaces? Native gym ones? Or specialized for PyTorch? I'm asking because I just landed a tool to register custom conversions; perhaps we could have a tensor-to-tensor space conversion or something like that (i.e., type the spec as being numpy- or torch-bound if that's an issue).
There is also some custom logic needed to make Maniskill3 work therefore I do not think a simple Gym wrapper would be enough and in this case a custom env is probably more appropriate.
Is it about auto-resets or broader than that?
How are the ManiSkill spaces? Native gym ones? Or specialized for PyTorch?
They are native gym ones, so I had to convert the dtype, e.g.:

```python
states_spec = Unbounded(
    shape=state_spec_shape,
    device=self._device,
    dtype=self.numpy_to_torch_dtype_dict[state_spec.dtype.type],
)
```
Essentially the specs are numpy but the data during step/reset is PyTorch, so you need to define the TorchRL specs as torch so that you have torch-torch (specs = torch, data returned = torch).
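As a side note, the `numpy_to_torch_dtype_dict` mapping does not have to be maintained by hand. A minimal sketch (the helper name below is mine, not from the gist) that derives the torch dtype via `torch.from_numpy`:

```python
import numpy as np
import torch

# Hypothetical helper (not from the gist): derive the torch dtype for a
# given numpy dtype by letting torch.from_numpy perform the conversion on
# an empty array, instead of maintaining a dtype dict by hand.
def numpy_to_torch_dtype(np_dtype) -> torch.dtype:
    return torch.from_numpy(np.empty(0, dtype=np_dtype)).dtype
```

`numpy_to_torch_dtype(np.float32)` then yields `torch.float32`, which can be passed as the `dtype` of the `Unbounded` spec.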
perhaps we could have tensor-to-tensor space conversion or smth like that
This might work, but I had to change the shapes so that the batch size is torch.Size([]) when using num_envs=1, and torch.Size([num_envs]) otherwise.
By default ManiSkill always returns data in a batch (e.g. num_envs = 1 => batch_size of torch.Size([1])).
Same for the action space.
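A minimal sketch of that shape convention (the helper names are assumptions, not from the gist):

```python
import torch

# Sketch of the batch-size convention described above: ManiSkill always
# returns batched data, while the TorchRL env should expose an empty
# batch size for a single env and [num_envs] otherwise.
def torchrl_batch_size(num_envs: int) -> torch.Size:
    return torch.Size([]) if num_envs == 1 else torch.Size([num_envs])

def maybe_squeeze(data: torch.Tensor, num_envs: int) -> torch.Tensor:
    # Drop the leading batch dim of size 1 that ManiSkill adds for num_envs=1.
    return data.squeeze(0) if num_envs == 1 else data
```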
Is it about auto-resets or broader than that?
The seeding needs to account for num_envs > 1.
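For illustration only (this helper is an assumption, not the author's code), per-sub-env seeds could be derived deterministically from a single base seed:

```python
import torch

# Hypothetical seeding helper: derive one deterministic seed per sub-env
# from a single base seed, so that seeding works for num_envs > 1.
def make_env_seeds(base_seed: int, num_envs: int) -> list:
    gen = torch.Generator().manual_seed(base_seed)
    return torch.randint(0, 2**31 - 1, (num_envs,), generator=gen).tolist()
```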
I had to use `VecGymEnvTransform(final_name="final_observation")`, but I also needed to make sure my `_reset` does not re-reset the env (e.g. via `env.step_and_maybe_reset()`, which is called by TorchRL internally).
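A minimal, hypothetical sketch of that guard (plain Python instead of TensorDicts; all names are assumptions): since `step_and_maybe_reset` already resets the done sub-envs, `_reset` should only call the underlying `env.reset()` when a reset is actually needed:

```python
# Hypothetical sketch (plain dicts instead of TensorDicts) of the guard
# described above. TorchRL passes a "_reset" mask when only some sub-envs
# should be reset; if no sub-env is flagged, the underlying env.reset()
# must NOT be called again, or the env gets double-reset.
def needs_underlying_reset(reset_td) -> bool:
    if reset_td is None:
        return True  # explicit full reset (e.g. env.reset() by the user)
    mask = reset_td.get("_reset")
    if mask is None:
        return True  # no partial-reset mask: treat as a full reset
    return any(mask)  # reset only if at least one sub-env is flagged
```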
The code is fairly messy right now, not happy about it tbh:
https://gist.github.com/AlexandreBrown/0df62a6c5653ac961d11734984867756
The gist is:
- Create the gym env using `gym.make`.
- Wrap the gym env in `ManiSkillVectorEnv` so that it now acts like a standard gym vectorized env.
- Wrap the `ManiSkillVectorEnv` env in a TorchRL custom env and make everyone happy (TorchRL, ManiSkill and Gym).
If you have a dirty script to share we can work on that! It doesn't need to be a fully fleshed-out PR.
@vmoens You can find the current implementation here : https://gist.github.com/AlexandreBrown/0df62a6c5653ac961d11734984867756
In practice, a lot of boilerplate could be removed if we inferred the `observation_spec` automatically.