language-table
Why not use 3D data for training?
Hi, thanks for your great work!
In manipulation tasks, a depth camera is an indispensable sensor and is easy to apply. Why didn't you use this kind of camera in your project and training procedure?
Thanks for your attention. I look forward to your response.
This is a good question.
Hi, thanks for the question. Overall, we tried to minimize the number of modalities, and we ended up being able to learn good policies using only RGB. Perhaps this is because the action space is constrained to 2D.
It's possible to experiment with this in sim by producing a dataset that also outputs a depth image, using the provided oracle policies.
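In case it helps, here is a rough sketch of what that data collection loop could look like. It does not use the actual language-table API: `ToyEnv`, `oracle_policy`, `collect_episode`, and the observation keys `rgb`, `depth`, `pos`, and `goal` are all hypothetical stand-ins, and "oracle" is assumed here to mean a scripted policy that reads the simulator state directly. The real environment and oracle classes would replace the toy ones.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Image = List[List[float]]  # toy stand-in for an H x W image array


@dataclass
class ToyEnv:
    """Hypothetical stand-in for a simulated tabletop environment."""
    size: int = 4
    horizon: int = 5
    pos: Tuple[float, float] = (0.0, 0.0)
    goal: Tuple[float, float] = (1.0, 1.0)
    t: int = 0

    def reset(self) -> Dict[str, object]:
        self.pos, self.t = (0.0, 0.0), 0
        return self._obs()

    def step(self, action: Tuple[float, float]):
        # 2D action: a planar displacement of the end effector.
        self.pos = (self.pos[0] + action[0], self.pos[1] + action[1])
        self.t += 1
        return self._obs(), self.t >= self.horizon

    def _obs(self) -> Dict[str, object]:
        # Constant images as placeholders; a real simulator would render these.
        rgb: Image = [[0.5] * self.size for _ in range(self.size)]
        depth: Image = [[1.0] * self.size for _ in range(self.size)]
        return {"rgb": rgb, "depth": depth, "pos": self.pos, "goal": self.goal}


def oracle_policy(obs: Dict[str, object]) -> Tuple[float, float]:
    """Scripted policy (assumed meaning of 'oracle'): uses privileged state."""
    (x, y), (gx, gy) = obs["pos"], obs["goal"]
    return (0.5 * (gx - x), 0.5 * (gy - y))


def collect_episode(env: ToyEnv, policy) -> List[Dict[str, object]]:
    """Roll out the policy and store RGB, depth, and action at every step."""
    obs, done, steps = env.reset(), False, []
    while not done:
        action = policy(obs)
        steps.append({"rgb": obs["rgb"], "depth": obs["depth"], "action": action})
        obs, done = env.step(action)
    return steps


episode = collect_episode(ToyEnv(), oracle_policy)
```

Each step in `episode` then carries a depth image next to the RGB image and the oracle's action, so a policy could be trained on RGB only, depth only, or both, and the results compared.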
Hi, thanks for your kind response. What do oracle (control) and script (label) mean? (Please forgive my poor English terminology.)
Is it because CLIP can only use RGB?