dhruvbatra
Cool to see that DDPPO training is getting in. Can we see a reward/success vs steps curve?
Interesting. Agreed that the untuned-reward policy indeed looks pretty bad. If you haven't already, consider chatting with @naokiyokoyama about his "gaze" policy.
Great! Glad to see this working. Agreed with the todos and look forward to them. Question -- what is the sensor suite for the spot-pick experiments? Just proprioception or proprioception+arm...
That's odd. If you give us a minimal reproducible example script (with details of the scene), we can look at reproducing it.
Can we get a GitHub repository where we can actually look at the code?
For MP3D, you can find some trajectories here: https://github.com/vincentcartillier/Semantic-MapNet
Thanks for finding this. Would you mind sending a PR to fix it? And @mpiseno -- perhaps you could review the PR?