ManiSkill Questions on Training Visuomotor Policies with RL

Open SumeetBatra opened this issue 1 year ago • 1 comments

Hi all,

I have compute constraints that limit the amount of image data I can work with, so I am looking for ways to save on compute while retaining performance. First question: is it possible to solve the task using only the base_camera image with any of the RL algorithms you have implemented? Second question: is there a specific reason the image tensor's default height and width are set to 128, or is it possible to work with lower-resolution images, say 64x64x4, without losing performance? Thanks!

Mar 27 '24 01:03 SumeetBatra

Unfortunately, for some tasks the base_camera image alone is not sufficient; a single view can make the task too hard, even with depth information. I do not have baseline results on hand, so you will have to test lower image sizes and see whether the task can still be solved. To change the image sizes, you can pass custom camera configs during env creation.
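To get a rough sense of what a lower resolution buys before running those tests, the image footprint can be estimated with simple arithmetic. The sketch below is only illustrative: the helper names, the one-byte-per-value storage, and the 256 environments by 50 steps rollout size are assumptions, not ManiSkill defaults, and the 4 channels follow the 64x64x4 shape from the question (RGB plus depth).

```python
def image_obs_bytes(height, width, channels=4, num_cameras=1, bytes_per_value=1):
    """Bytes needed to store the image part of one observation.

    Assumes every channel is stored at the same precision; depth is often
    stored with more than one byte per value, so treat this as a lower bound.
    """
    return height * width * channels * num_cameras * bytes_per_value


def rollout_mib(height, width, num_envs, num_steps, **kwargs):
    """Image storage for one rollout of num_envs * num_steps observations, in MiB."""
    total = image_obs_bytes(height, width, **kwargs) * num_envs * num_steps
    return total / 2**20


# Hypothetical rollout size, chosen only for illustration.
NUM_ENVS, NUM_STEPS = 256, 50

default_bytes = image_obs_bytes(128, 128)  # 128 * 128 * 4 values
small_bytes = image_obs_bytes(64, 64)      # 64 * 64 * 4 values

print(f"128x128x4: {default_bytes} bytes per observation")
print(f"64x64x4:   {small_bytes} bytes per observation")
print(f"reduction: {default_bytes / small_bytes:.0f}x")
print(f"rollout at 128x128: {rollout_mib(128, 128, NUM_ENVS, NUM_STEPS):.0f} MiB")
print(f"rollout at 64x64:   {rollout_mib(64, 64, NUM_ENVS, NUM_STEPS):.0f} MiB")
```

Halving each side cuts the image data by a factor of four, and adding a second camera view doubles it again (`num_cameras=2`). Whether the task is still solvable at the smaller size has to be checked empirically, as noted above.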

Mar 28 '24 23:03 StoneT2000

Closing this issue as it has gone stale. Feel free to reopen it if the question is not resolved.

Apr 30 '24 05:04 StoneT2000
