trl Small changes when integrating into H4

Open natolambert opened this issue 1 year ago • 4 comments

Two changes:

1. Pass the optimizer in the sentiment example (currently the variable is not passed into the trainer).
2. [I think] Fix the kwarg handling for the wandb config passed to Accelerate. See this docs page, where init_kwargs is handled differently. When I try to use this with the code as is, wandb gets read as a kwarg and is not handled correctly by this line. If this is different for TensorBoard, it may just be incompatible.
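
For reference, here is a rough sketch of the shape the second point is about. Accelerate's documentation describes `init_kwargs` for `init_trackers` as a nested dictionary keyed by tracker name, e.g. `{"wandb": {...}}`. The helper name `nest_tracker_kwargs` and the example values are hypothetical and only for illustration; this is not the code changed in this PR.

```python
from typing import Any, Dict, Optional


def nest_tracker_kwargs(
    log_with: Optional[str],
    tracker_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Wrap flat tracker kwargs under the tracker's name.

    Hypothetical helper: it assumes `tracker_kwargs` is a flat dict of
    options for the single tracker named by `log_with`.
    """
    if not log_with or not tracker_kwargs:
        # Nothing to forward to the tracker.
        return {}
    # Copy so the caller's dict is not mutated later on.
    return {log_with: dict(tracker_kwargs)}


# Example values are made up for illustration.
init_kwargs = nest_tracker_kwargs("wandb", {"entity": "my-team", "name": "ppo-run"})

# With a real Accelerator this would then be forwarded roughly as:
# accelerator.init_trackers("trl", config=..., init_kwargs=init_kwargs)
```

If no tracker or no kwargs are set, the helper returns an empty dict, so the same call path can be shared between wandb and TensorBoard runs.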

Let me know if I'm wrong!

Fixes: #215

Mar 14 '23 03:03 natolambert

Closes https://github.com/lvwerra/trl/issues/215 if point 1 is correct, @younesbelkada!

Mar 14 '23 03:03 natolambert

The documentation is not available anymore as the PR was closed or merged.

Mar 14 '23 03:03 HuggingFaceDocBuilderDev

I tested the logging change with my code in H4 (https://github.com/huggingface/h4/pull/73), and it fixed my problem!

Mar 14 '23 03:03 natolambert

I'll test tensorboard today. FYI this is needed for the script in H4, so I'll be motivated to get this working soon.

If tensorboard doesn't work, I'll prolly do an if statement.

Mar 14 '23 16:03 natolambert

@younesbelkada I think I ran this with tensorboard (I just changed the config as follows and it didn't error). Seems good to me?

The term I changed, tracker_kwargs, was not actually used anywhere in TRL to date.

from trl import PPOConfig

config = PPOConfig(
    model_name="ybelkada/gpt-j-6b-sharded-bf16",
    learning_rate=(1.47e-5) * 2,
    # log_with="wandb",
    log_with="tensorboard",
    # logging_dir is forwarded to Accelerate
    accelerator_kwargs={"logging_dir": "/home/nathan/logs/"},
    batch_size=32,
    forward_batch_size=1,
)

Mar 14 '23 19:03 natolambert

Thanks a lot for experimenting @natolambert ! LGTM

Mar 14 '23 19:03 younesbelkada
