robosuite-benchmark
Number of Training Steps for Baselines
Hi,
I am currently trying to reproduce the baselines shown as plots in the robosuite whitepaper for the Wipe environment. The whitepaper states:
All agents were trained for 500 epochs with 500 steps per episode
From this I conclude that the overall number of training steps is 500 * 500 = 250 000. However, in your file:
https://github.com/ARISE-Initiative/robosuite-benchmark/blob/master/runs/Wipe-Panda-OSC-POSE-SEED83/Wipe_Panda_OSC_POSE_SEED83_2020_09_21_23_14_04_0000--s-0/variant.json
you set "num_expl_steps_per_train_loop": 2500 and "num_epochs": 2000. That means 2500 steps per epoch for 2000 epochs, amounting to 2500 * 2000 = 5 000 000 training steps.
Then, looking at the file with the training statistics:
https://github.com/ARISE-Initiative/robosuite-benchmark/blob/master/runs/Wipe-Panda-OSC-POSE-SEED83/Wipe_Panda_OSC_POSE_SEED83_2020_09_21_23_14_04_0000--s-0/progress.csv
training ran for 720 epochs, amounting to 1 825 800 training steps (with 2500 steps per epoch, as can be seen from the growing replay buffer).
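For reference, this is roughly how the epoch count and the final step count can be read from such a log. It is only a minimal sketch: the helper `summarize_progress` is hypothetical, and the column name `expl/num steps total` is an assumption about the CSV header, so it has to be replaced with whatever the actual file uses. The sample data below is synthetic, not taken from the run.

```python
import csv
import io


def summarize_progress(csv_file, steps_column):
    """Return (number of logged epochs, last value of the steps column).

    csv_file is any open text file object; steps_column is the header of the
    column holding the cumulative step count (assumed name, check the file).
    """
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("progress file contains no data rows")
    # Values may be logged as floats, so go through float before int.
    last_steps = int(float(rows[-1][steps_column]))
    return len(rows), last_steps


# Tiny synthetic stand-in for progress.csv (three epochs of 2500 steps each).
sample = io.StringIO(
    "Epoch,expl/num steps total\n"
    "0,2500\n"
    "1,5000\n"
    "2,7500\n"
)
epochs, total_steps = summarize_progress(sample, "expl/num steps total")
print(epochs, total_steps)
```

For the real file, the same function can be called with `open("progress.csv", newline="")` in place of the in-memory sample.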
I assume that in the whitepaper you trained for 500 epochs but with 2500 steps per epoch, which would give 1 250 000 training steps overall. Can you confirm which of these numbers is the one you recommend using?
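To make the comparison explicit, here is the arithmetic for the three readings as a short sketch. The labels are just my own descriptions of each reading, and the epoch and step values are the ones quoted above.

```python
# (epochs, steps per epoch) for each reading of the training setup
candidates = {
    "whitepaper as I read it": (500, 500),
    "variant.json": (2000, 2500),
    "my assumption": (500, 2500),
}

# Total training steps = epochs * steps per epoch
totals = {label: epochs * steps for label, (epochs, steps) in candidates.items()}

for label, total in totals.items():
    print(f"{label}: {total:,} steps")
```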