
Finetuning on DROID


Hi @kpertsch,

thanks for the great work! Would you mind sharing the LR schedule, batch size, etc. for the pi0 and pi0-FAST DROID finetunes?

I'm trying to finetune pi0 on DROID from scratch and I'm encountering some weird divergences after a while. Currently I'm following the FAST paper in using only successful episodes and a batch size of 256.

Thanks!

ponimatkin avatar Apr 24 '25 21:04 ponimatkin

Hi! Batch size 256 should work! We used learning rate 5e-5. I haven't observed divergences in training though, so that's a bit fishy! You may also want to start with FAST fine-tuning (vs. pi0 diffusion-style training) -- it should converge more quickly, so it will give you quicker iteration on your setup!

-- Karl

kpertsch avatar Apr 24 '25 21:04 kpertsch
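For reference, here is a minimal sketch of the settings above, expressed with optax (the JAX optimizer library openpi builds on). Only the peak learning rate (5e-5) and batch size (256) are confirmed in this thread; the optimizer choice and the warmup/decay step counts below are assumptions (the pi0-FAST paper gives the full recipe):

```python
# Minimal sketch, not openpi's actual config: only peak LR 5e-5 and
# batch size 256 are confirmed in this thread. AdamW and the
# warmup/decay step counts are assumptions.
import optax

schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,      # ramp up from zero during warmup
    peak_value=5e-5,     # learning rate confirmed in this thread
    warmup_steps=1_000,  # assumption, not stated in the thread
    decay_steps=30_000,  # assumption, not stated in the thread
    end_value=5e-6,      # assumption: decay to 10% of peak
)
optimizer = optax.adamw(learning_rate=schedule)
# Batch size 256 (confirmed above) is set in the data loader, not here.
```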

Hi @kpertsch, sorry for reopening the issue. I'm starting to get somewhere, but it seems like DROID finetuning is very sensitive to the LR and schedule. :) Is this expected?

Also, would you please mind sharing:

  1. Which optimizer was used? SGD or AdamW?
  2. Which LR schedule was used? Cosine with warmup, constant, or something else?
  3. If a cosine schedule was used, what final LR and how many warmup steps were used?
  4. Last but not least, it seems DROID finetuning depends on the action_horizon parameter. With FAST I see no issues changing it, but for diffusion pi0 the provided checkpoint has action_horizon = 10 while the original value is action_horizon = 50. Did you also train with action_horizon = 10 for diffusion pi0 on DROID? (See the config sketch after this list.)
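For context on point 4, a sketch of how pinning action_horizon might look. Pi0Config and its field name are assumptions based on openpi's config layout; verify against the config files in your checkout:

```python
# Hypothetical sketch: Pi0Config and the action_horizon field are assumed
# from openpi's config layout -- verify against your checkout.
from openpi.models import pi0

# The released DROID checkpoint uses action_horizon = 10, while the base
# pi0 value is 50, so a fine-tuning config resuming from that checkpoint
# would need to match it:
model_config = pi0.Pi0Config(action_horizon=10)
```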

Thank you for your help!

Best, Georgy

Edit: OK, I see that points 1-3 were answered in the pi0-FAST paper, my bad, so only the last one remains. :)

ponimatkin avatar Apr 27 '25 12:04 ponimatkin