Small changes when integrating into H4
Two changes:
- Pass the optimizer in the sentiment example (the variable was created but not passed into the trainer).
- [I think] Fix the kwarg option for the wandb config of `Accelerate`. See this docs page, where `init_kwargs` is handled differently. In trying to use this with the code as is, `wandb` is getting read as a `kwarg` and not handled correctly by this line. If this is different in Tensorboard, it may just be incompatible.
Let me know if I'm wrong!
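To illustrate the second point: Accelerate's `init_trackers` takes an `init_kwargs` dict that is keyed by tracker name, so flat options need to be nested under `"wandb"` before they are forwarded. Below is a minimal sketch of that nesting. The helper name `build_init_kwargs` and the example run name are hypothetical and are not part of TRL or this PR.

```python
from typing import Any, Dict, Optional


def build_init_kwargs(
    log_with: Optional[str], tracker_kwargs: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Nest flat tracker options under the tracker name.

    Hypothetical helper: Accelerate's `init_trackers(..., init_kwargs=...)`
    expects a dict keyed by tracker name, e.g. {"wandb": {"name": "my-run"}}.
    """
    # Nothing to forward if no tracker is configured or no options are given.
    if log_with is None or not tracker_kwargs:
        return {}
    # Copy the options so the caller's dict is not shared with the tracker.
    return {log_with: dict(tracker_kwargs)}


# Example value only; "ppo-sentiment-run" is a made-up run name.
init_kwargs = build_init_kwargs("wandb", {"name": "ppo-sentiment-run"})
print(init_kwargs)

# The result would then be passed on roughly like this (not executed here):
# accelerator.init_trackers("trl", config=config_dict, init_kwargs=init_kwargs)
```

Returning an empty dict when no tracker is set keeps the call site free of special cases, which may be enough to avoid a separate branch per tracker.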
Fixes: #215
Closes https://github.com/lvwerra/trl/issues/215 if I'm correct on point 1, @younesbelkada!
The documentation is not available anymore as the PR was closed or merged.
I tested the logging change with my code in H4 (https://github.com/huggingface/h4/pull/73), and it fixed my problem!
I'll test `tensorboard` today. FYI this is needed for the script in H4, so I'll be motivated to get this working soon.
If `tensorboard` doesn't work, I'll probably add an if statement.
@younesbelkada I think I ran this with `tensorboard` (I just changed the config as follows and it didn't error). Seems good to me?
The term I changed, `tracker_kwargs`, was not actually used anywhere in TRL to date.
from trl import PPOConfig

config = PPOConfig(
    model_name="ybelkada/gpt-j-6b-sharded-bf16",
    learning_rate=(1.47e-5) * 2,
    # log_with="wandb",
    log_with="tensorboard",
    # Forwarded to Accelerate; tells the tensorboard tracker where to write logs.
    accelerator_kwargs={"logging_dir": "/home/nathan/logs/"},
    batch_size=32,
    forward_batch_size=1,
)
Thanks a lot for experimenting, @natolambert! LGTM