Stephen Wang comments


Results: 3 comments by Stephen Wang

[ChatLLaMA] RLHF Training: Prompt too long

Thanks for the recommendations. Unfortunately, the error still persists. Can I simply increase `additional_prompt_tokens`, or would I need to save a new actor model? Below is my config.yaml ```...

[ChatLLaMA] RLHF Training: Prompt too long

Hi @PierpaoloSorbellini, thank you for rolling out the fixes. This might not be very specific, but although I was able to get further into training, around the 9th timestep the...

Issues with accelerate and deepspeed training

Hi @EikeKohl. Yes, here is my config.yaml

```
---
trainer_config:
  # learning rates
  actor_lr: 0.000005
  critic_lr: 0.000009
  # PPO Hyperparameters
  actor_eps_clip: 0.2
  critic_eps_clip: 0.2
  beta_s: 0.02
  # coefficient for the...
```