ManiSkill
Questions on Training Visuomotor Policies with RL
Hi all,
I have compute constraints that limit the amount of image data I can work with, so I'm looking for ways to save on compute while retaining performance. First question: is it possible to solve the tasks using only the base_camera image with any of the RL algorithms you have implemented? Second question: is there a specific reason the image tensor's default height and width are set to 128, or is it possible to work with lower-resolution images, say 64x64x4, without losing performance? Thanks!
Unfortunately, for some tasks the base_camera image alone is not sufficient: a single view can make the task too hard, even with depth information. I don't have baseline results for lower resolutions on hand, so you will have to test smaller image sizes and see whether the task can still be solved. To change the image size, you can pass custom camera configs when creating the env.
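As a rough illustration of the last point, here is a minimal sketch of overriding the camera resolution at env creation, plus a quick estimate of how much image memory a smaller resolution saves. Several things here are assumptions rather than confirmed details: the keyword name (`sensor_configs` in recent ManiSkill releases, `camera_cfgs` in older ones, as far as I recall), the env id `PickCube-v1`, and storing one byte per pixel value. The helper names are hypothetical. Check the docs for the version you have installed.

```python
def image_bytes(height, width, channels=4, bytes_per_value=1):
    """Bytes needed to store one image observation.

    ASSUMPTION: 4 channels (e.g. RGB + depth) stored at 1 byte per value.
    Adjust bytes_per_value if your depth channel uses a wider dtype.
    """
    return height * width * channels * bytes_per_value


def rollout_buffer_bytes(num_envs, num_steps, height, width, channels=4):
    """Bytes needed to hold the images of one rollout for a single camera."""
    return num_envs * num_steps * image_bytes(height, width, channels)


def camera_kwargs(height, width, key="sensor_configs"):
    """Build the keyword argument that overrides camera resolution.

    ASSUMPTION: the keyword name depends on the ManiSkill version
    ("sensor_configs" in newer releases, "camera_cfgs" in older ones).
    """
    return {key: dict(width=width, height=height)}


def make_env(env_id="PickCube-v1", height=64, width=64, key="sensor_configs"):
    """Create an env with smaller camera images.

    Imports are done lazily so the helpers above can be used without
    ManiSkill installed. The env id is only a placeholder.
    """
    import gymnasium as gym
    import mani_skill.envs  # noqa: F401  (registers the environments)

    return gym.make(env_id, obs_mode="rgbd", **camera_kwargs(height, width, key))


if __name__ == "__main__":
    full = rollout_buffer_bytes(num_envs=16, num_steps=50, height=128, width=128)
    small = rollout_buffer_bytes(num_envs=16, num_steps=50, height=64, width=64)
    print(f"128x128 rollout images: {full / 1e6:.1f} MB")
    print(f"64x64 rollout images:   {small / 1e6:.1f} MB")
    print(f"reduction factor:       {full // small}x")
```

Halving both height and width cuts the pixel count by a factor of four, so image storage and the first convolution layers get roughly four times cheaper. Whether the policy still solves the task at that resolution is something only a training run can tell you.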
Closing this issue as it has gone stale. Feel free to reopen if you still have questions.