Finetuning on DROID
Hi @kpertsch,
thanks for the great work! I wanted to ask: would you perhaps mind sharing the LR schedule, batch size, etc. for the pi0 and pi0-FAST DROID finetunes?
I'm trying to finetune pi0 on DROID from scratch and I'm encountering some weird divergences after some time. Currently I'm following the FAST paper: using only successful episodes and a batch size of 256.
Thanks!
Hi! Batch size 256 should work! We used a learning rate of 5e-5. I haven't observed divergences in training though, so that's a bit fishy! You may also want to start with FAST fine-tuning (vs. pi0 diffusion-style training) -- it should converge more quickly, so it will give you faster iteration on your setup!
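For concreteness, here's a rough sketch of a warmup + cosine schedule at that peak LR. The warmup/total step counts and the final-LR fraction below are just placeholders to show the shape, not our exact configs:

```python
import math

# Peak LR 5e-5 and batch size 256 are from this thread; the step counts
# and END_LR fraction are illustrative placeholders only.
PEAK_LR = 5e-5
WARMUP_STEPS = 1_000    # placeholder
TOTAL_STEPS = 30_000    # placeholder
END_LR = PEAK_LR * 0.1  # placeholder

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to END_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay over the remaining steps, clamped at TOTAL_STEPS.
    progress = min((step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return END_LR + (PEAK_LR - END_LR) * cosine
```

The key thing is the shape: ramp up linearly so early gradients don't blow up, then decay smoothly -- exact step counts matter less than the peak LR.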
-- Karl
Hi @kpertsch, sorry for reopening the issue. I'm starting to get somewhere, but it seems DROID finetuning is very sensitive to the LR and schedule. :) Is this expected?
Also, would you please mind sharing:
- Which optimizer was used? SGD or AdamW?
- Which LR schedule was used? Cosine with warmup, constant or something else?
- If cosine schedule was used, which final LR and how many warmup steps were used?
- Last but not least, DROID finetuning seems to depend on the `action_horizon` parameter. For FAST I see no issues with changing it, but for diffusion pi0 the provided checkpoint has `action_horizon = 10` while the original value is `action_horizon = 50`. Did you also train with `action_horizon = 10` for diffusion pi0 on DROID?
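To illustrate what I mean in the last point: with `action_horizon = 10` the model predicts 10-step action chunks, so 50-step trajectories have to be re-chunked. A quick sketch of what I'm doing (hypothetical helper written for this post, not openpi code):

```python
import numpy as np

def chunk_actions(actions: np.ndarray, action_horizon: int) -> np.ndarray:
    """Slice a (T, action_dim) trajectory into per-timestep
    (T, action_horizon, action_dim) chunks, padding the tail by
    repeating the last action. Hypothetical helper, not openpi code."""
    T, _ = actions.shape
    padded = np.concatenate(
        [actions, np.repeat(actions[-1:], action_horizon - 1, axis=0)], axis=0
    )
    return np.stack([padded[t : t + action_horizon] for t in range(T)])

# A 50-step trajectory re-chunked for action_horizon = 10:
traj = np.zeros((50, 7))           # e.g. 7-DoF actions
chunks = chunk_actions(traj, 10)   # shape (50, 10, 7)
```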
Thank you for your help!
Best Georgy
Edit: Ok, I see that points 1-3 were answered in the Pi0-FAST paper, my bad, so only the last one remains. :)