Exploding gradients/weights/logits in a fine-tuning notebook
Hello, after fine-tuning successfully, I got to the last step, but the following error occurs when generating text. What could be the reason?
Hi @81549361, sorry for the late response.
One possible reason is that the model weights/gradients exploded during training. You would see this on the loss plot in wandb.
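For anyone hitting this before a fix lands: here is a minimal sketch of how to spot an explosion in wandb, assuming a standard PyTorch training loop (`model` and `loss` are placeholders from your own code, and the `grad_norm` metric name is just an illustration):

```python
import torch
import wandb

def log_step(model: torch.nn.Module, loss: torch.Tensor, step: int) -> None:
    # Total L2 norm over all parameter gradients. A sudden spike here,
    # or a loss that jumps to NaN/inf, is the usual sign of an explosion.
    grads = [p.grad.detach().norm() for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack(grads)) if grads else torch.tensor(0.0)
    wandb.log({"loss": loss.item(), "grad_norm": grad_norm.item()}, step=step)
```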
We'll double-check training hyperparams and fix them to avoid this behavior (cc @justheuristic).
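In the meantime, the usual hyperparameter-side mitigations are gradient clipping and a lower learning rate. Below is a sketch of a clipped training step, assuming a Hugging Face-style model whose forward pass returns an object with a `.loss` attribute; the `max_norm=1.0` value is illustrative, not the setting from the actual fix:

```python
import torch

def training_step(model, optimizer, batch, max_norm=1.0):
    optimizer.zero_grad()
    loss = model(**batch).loss
    loss.backward()
    # Clip the global gradient norm so one bad batch cannot blow up the weights.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss
```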
Hi @81549361, we've just pushed https://github.com/bigscience-workshop/petals/pull/343, which should fix the explosion. Can you try running the updated SST-2 prompt tuning notebook from the main branch?