Why not use 3D data for training?

Open oym1994 opened this issue 1 year ago • 5 comments

Hi, thanks for your great work!

In manipulation tasks, a depth camera is an indispensable sensor and is easy to apply. Why didn't you use this kind of camera in your project and training procedure?

Thanks for your attention. I look forward to your response.

oym1994 avatar Mar 25 '23 03:03 oym1994

This is a good question.

yangjiangeyjg avatar Mar 25 '23 16:03 yangjiangeyjg

Hi, thanks for the question. Overall we tried to minimize the number of modalities, and ended up being able to learn good policies using only RGB. Perhaps this is because the action space is constrained to 2D.

ayzaan avatar Mar 25 '23 18:03 ayzaan

It's possible to experiment with this in sim by producing a dataset that also outputs a depth image, using the provided oracle policies.
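
As a rough illustration of that idea, here is a minimal sketch of a collection loop that stores a depth image next to each RGB frame. Everything in it is an assumption made for illustration: `ToyPushEnv`, `oracle_action`, and the `rgb` / `depth` / `pos` / `goal` observation keys are hypothetical stand-ins, not the actual language-table environment or its oracle policies, and the depth image is a constant placeholder rather than a rendered one.

```python
import numpy as np


class ToyPushEnv:
    """Hypothetical stand-in for a simulated tabletop env (NOT the language-table API)."""

    def __init__(self, size=64, seed=0):
        self.size = size
        self.rng = np.random.default_rng(seed)
        self.pos = np.zeros(2)
        self.goal = np.zeros(2)

    def reset(self):
        # Sample a start and a goal position in the 2D workspace [-1, 1]^2.
        self.pos = self.rng.uniform(-1.0, 1.0, size=2)
        self.goal = self.rng.uniform(-1.0, 1.0, size=2)
        return self._obs()

    def step(self, action):
        # 2D action space: a planar displacement of the end effector.
        self.pos = np.clip(self.pos + action, -1.0, 1.0)
        done = bool(np.linalg.norm(self.goal - self.pos) < 0.05)
        return self._obs(), done

    def _obs(self):
        # Blank RGB frame with one white pixel marking the current position.
        rgb = np.zeros((self.size, self.size, 3), dtype=np.uint8)
        col, row = ((self.pos + 1.0) / 2.0 * (self.size - 1)).astype(int)
        rgb[row, col] = 255
        # Placeholder depth: a flat surface 1.0 units from the camera.
        depth = np.full((self.size, self.size), 1.0, dtype=np.float32)
        return {"rgb": rgb, "depth": depth,
                "pos": self.pos.copy(), "goal": self.goal.copy()}


def oracle_action(obs, max_step=0.1):
    """Hypothetical scripted oracle: step straight toward the goal."""
    delta = obs["goal"] - obs["pos"]
    dist = np.linalg.norm(delta)
    if dist <= max_step:
        return delta
    return delta / dist * max_step


def collect_episode(env, max_steps=100):
    """Roll out the oracle and record (rgb, depth, action) for every step."""
    obs = env.reset()
    steps = []
    for _ in range(max_steps):
        action = oracle_action(obs)
        steps.append({"rgb": obs["rgb"], "depth": obs["depth"], "action": action})
        obs, done = env.step(action)
        if done:
            break
    return steps


episode = collect_episode(ToyPushEnv(seed=0))
```

With a real simulator, the only parts that would change are how the depth image is rendered and which oracle is queried; the recorded episode could then be used to compare an RGB-only policy against an RGB-D one.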

ayzaan avatar Mar 25 '23 18:03 ayzaan

> It's possible to experiment with this in sim by producing a dataset that also outputs a depth image, using the provided oracle policies.

Hi, thanks for your kind response. What do oracle (control) and script (label) mean? (Please forgive my poor English terminology.)

oym1994 avatar Apr 03 '23 12:04 oym1994

Is it because CLIP can only use RGB?

Bailey-24 avatar May 10 '23 03:05 Bailey-24