
Clarification on evaluation metric

Open gunshi opened this issue 2 years ago • 1 comment

Hi, thanks for open-sourcing this framework! I'm trying to reproduce the baseline results reported in the Robohive paper. What exactly is the metric that is averaged over 3 seeds in the Franka-expert data runs (here: https://github.com/facebookresearch/agenthive/tree/dev/scripts)? Is it the maximum success rate within each run, averaged over 3 seeds; the maximum over checkpoints of the success rate averaged over 3 seeds; or something else? The paper doesn't seem to say how the success rate of a run is determined across its many checkpoints. Thanks!

gunshi avatar Nov 06 '23 09:11 gunshi

We report the average success rate over three seeds × three camera angles (except for the Robel suite, where we use all the camera angles). The success rate of each run is measured at its last checkpoint.

ShahRutav avatar Dec 09 '23 16:12 ShahRutav
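
For anyone reproducing the numbers, here is a minimal sketch of how the three candidate metrics from the question differ. It is not code from the agenthive repository: the function names and the array layout `(seeds, cameras, checkpoints)` are assumptions made for illustration, and the values in `demo` are synthetic. Per the reply above, the reported number corresponds to `last_checkpoint_mean`.

```python
import numpy as np


def last_checkpoint_mean(success):
    """Mean success rate at the final checkpoint over all seeds and cameras.

    `success` is assumed to have shape (seeds, cameras, checkpoints),
    with each entry a success rate in [0, 1].
    """
    success = np.asarray(success, dtype=float)
    return float(success[:, :, -1].mean())


def mean_of_max(success):
    """Best checkpoint picked per (seed, camera) run, then averaged."""
    success = np.asarray(success, dtype=float)
    return float(success.max(axis=2).mean())


def max_of_mean(success):
    """Average over seeds and cameras per checkpoint, then best checkpoint."""
    success = np.asarray(success, dtype=float)
    return float(success.mean(axis=(0, 1)).max())


# Synthetic example: 2 seeds x 2 cameras x 3 checkpoints (not real results).
demo = np.array([
    [[0.2, 0.8, 0.6], [0.4, 0.6, 0.6]],
    [[0.0, 0.4, 0.8], [0.2, 1.0, 0.4]],
])

print("last checkpoint:", last_checkpoint_mean(demo))
print("mean of max:    ", mean_of_max(demo))
print("max of mean:    ", max_of_mean(demo))
```

The two max-based variants select a checkpoint after seeing its evaluation result, so they can only match or exceed the last-checkpoint number; that is worth keeping in mind when comparing against other reported baselines.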