Small changes when integrating into H4

Open natolambert opened this issue 1 year ago • 4 comments

Two changes:

  1. Pass the optimizer in the sentiment example (currently the variable is created but not passed into the trainer).
  2. [I think] Fix the kwarg handling for the wandb config passed to Accelerate. See this docs page, where init_kwargs is handled differently. When I try to use this with the code as is, wandb gets read as a kwarg and is not handled correctly by this line. If this is different in Tensorboard, it may just be incompatible.
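
For readers following point 2, here is a minimal sketch of the shape Accelerate expects for `init_kwargs`: a dict keyed by tracker name, with that tracker's options nested underneath. The helper `nest_tracker_kwargs` and the option values are hypothetical, made up for illustration, and this is not necessarily how the PR implements the fix.

```python
from typing import Any, Dict, Optional


def nest_tracker_kwargs(
    log_with: Optional[str], tracker_kwargs: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: key tracker options by tracker name.

    Accelerate's `init_trackers` takes `init_kwargs` shaped like
    {"wandb": {...}} or {"tensorboard": {...}}, not a flat dict.
    """
    if log_with is None or not tracker_kwargs:
        return {}
    # Copy so later edits to the caller's dict do not leak into the result.
    return {log_with: dict(tracker_kwargs)}


# Flat options as a user might write them (values are illustrative only).
wandb_options = {"entity": "my-team", "name": "ppo-sentiment"}
init_kwargs = nest_tracker_kwargs("wandb", wandb_options)
print(init_kwargs)

# With Accelerate installed, the nested dict would be passed along as:
#   accelerator = Accelerator(log_with="wandb")
#   accelerator.init_trackers("trl", config={"batch_size": 32}, init_kwargs=init_kwargs)
```

Because the options are keyed by tracker name, the same dict can carry entries for several trackers, and a tracker with no entry simply gets no extra options.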

Let me know if I'm wrong!

Fixes: #215

natolambert, Mar 14 '23 03:03

Closes https://github.com/lvwerra/trl/issues/215 if I'm correct on point 1, @younesbelkada!

natolambert, Mar 14 '23 03:03

The documentation is not available anymore as the PR was closed or merged.

I tested the logging change with my code in H4 (https://github.com/huggingface/h4/pull/73), and it fixed my problem!

natolambert, Mar 14 '23 03:03

I'll test tensorboard today. FYI this is needed for the script in H4, so I'll be motivated to get this working soon.

If tensorboard doesn't work, I'll prolly do an if statement.

natolambert, Mar 14 '23 16:03

@younesbelkada I think I ran this with tensorboard (I just changed the config as follows and it didn't error). Seems good to me?

The term I changed, tracker_kwargs, was not actually used anywhere in TRL to date.

from trl import PPOConfig

config = PPOConfig(
    model_name="ybelkada/gpt-j-6b-sharded-bf16",
    learning_rate=(1.47e-5) * 2,
    # log_with="wandb",
    log_with="tensorboard",
    # Forwarded to Accelerate; logging_dir is where the tensorboard logs go.
    accelerator_kwargs={"logging_dir": "/home/nathan/logs/"},
    batch_size=32,
    forward_batch_size=1,
)

natolambert, Mar 14 '23 19:03

Thanks a lot for experimenting, @natolambert! LGTM

younesbelkada, Mar 14 '23 19:03