Exploding gradients/weights/logits in a fine-tuning notebook
Hello, after fine-tuning successfully, I got to the last step, but the following error occurs when generating text. What could be the reason?
Hi @81549361, sorry for the late response.
One possible reason is that the model weights/gradients exploded during training. You would see this on the loss plot in wandb.
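For anyone hitting this before a fix lands: here is a minimal sketch of how to spot an explosion in wandb, assuming a standard PyTorch training loop (`model` and `loss` are placeholders from your own code, and the `grad_norm` metric name is just an illustration):

```python
import torch
import wandb

def log_step(model: torch.nn.Module, loss: torch.Tensor, step: int) -> None:
    # Total L2 norm over all parameter gradients. A sudden spike here,
    # or a loss that jumps to NaN/inf, is the usual sign of an explosion.
    grads = [p.grad.detach().norm() for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack(grads)) if grads else torch.tensor(0.0)
    wandb.log({"loss": loss.item(), "grad_norm": grad_norm.item()}, step=step)
```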
We'll double-check training hyperparams and fix them to avoid this behavior (cc @justheuristic).
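In the meantime, the usual hyperparameter-side mitigations are gradient clipping and a lower learning rate. Below is a sketch of a clipped training step, assuming a Hugging Face-style model whose forward pass returns an object with a `.loss` attribute; the `max_norm=1.0` value is illustrative, not the setting from the actual fix:

```python
import torch

def training_step(model, optimizer, batch, max_norm=1.0):
    optimizer.zero_grad()
    loss = model(**batch).loss
    loss.backward()
    # Clip the global gradient norm so one bad batch cannot blow up the weights.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss
```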
Hi @81549361, we've just pushed https://github.com/bigscience-workshop/petals/pull/343, which should fix the explosion. Can you try running the updated SST-2 prompt tuning notebook from the main branch?