trl
Extra code in toxicity example
In the toxicity script, should the optimizer be passed to the PPOTrainer, or should it be omitted?
I found this while setting up the optimizer for H4 by copying the code over, which fails with the following traceback:
  File "/home/nathan/h4/scripts/training/run_rl.py", line 479, in <module>
    main()
  File "/home/nathan/h4/scripts/training/run_rl.py", line 251, in main
    optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=training_args.rl_learning_rate)
  File "/opt/conda/envs/h4/lib/python3.10/site-packages/torch/optim/adam.py", line 137, in __init__
    super(Adam, self).__init__(params, defaults)
  File "/opt/conda/envs/h4/lib/python3.10/site-packages/torch/optim/optimizer.py", line 61, in __init__
    raise ValueError("optimizer got an empty parameter list")
ValueError: optimizer got an empty parameter list
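One possible reading of the traceback is that the `filter(...)` expression yields nothing, for example because every parameter has `requires_grad=False` at the point where the optimizer is built. That is a guess, not something confirmed here. A minimal sketch of a guard that makes this case explicit is below; `trainable_params` and `FakeParam` are hypothetical names used only for illustration, and `FakeParam` stands in for a real `torch.nn.Parameter` so the snippet runs without torch.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


def trainable_params(params: Iterable[Any]) -> List[Any]:
    """Hypothetical helper: collect parameters whose requires_grad is True.

    Materializes the result as a list so it can be inspected before it is
    handed to an optimizer, and fails early with a more specific message
    when nothing is trainable.
    """
    selected = [p for p in params if getattr(p, "requires_grad", False)]
    if not selected:
        raise ValueError(
            "no trainable parameters found: every parameter has "
            "requires_grad=False (or the model has no parameters)"
        )
    return selected


@dataclass
class FakeParam:
    # Stand-in for torch.nn.Parameter; only the attribute used above is modeled.
    name: str
    requires_grad: bool


# All parameters frozen: this mirrors the suspected failing case.
frozen = [FakeParam("weight", False), FakeParam("bias", False)]
# At least one trainable parameter: this is the case the optimizer expects.
mixed = [FakeParam("weight", False), FakeParam("bias", True)]

print([p.name for p in trainable_params(mixed)])
try:
    trainable_params(frozen)
except ValueError as err:
    print(f"caught: {err}")
```

With a real model, the same idea would be to build the list from `model.parameters()` first and pass that list to `Adam`, so an empty result can be checked (or printed) before the optimizer constructor raises.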