Jinhua Wang issues

Results: 12 issues by Jinhua Wang

Should critic's input be prompt only?

In the PPO implementation, it seems that the critic model considers both prompt and generated actions as the input (if pooled is true, then generated actions only). However, if we...

Sequence Parallelism Aware Training Loss

Hi there! When you are training with sequence parallel attention, I was wondering if you scale the loss function properly, as each GPU card will only contain a subset of...
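
To make the concern concrete, here is a minimal sketch in plain Python, not taken from any particular training framework. It assumes each rank holds one shard of a sequence's per-token losses plus a mask of valid tokens; the function names and the two-rank data are hypothetical. It compares averaging the per-rank mean losses with dividing the global loss sum by the global valid-token count.

```python
def local_mean(losses, mask):
    """Mean loss over the valid (mask == 1) tokens of one shard."""
    valid = [l for l, m in zip(losses, mask) if m]
    return sum(valid) / len(valid) if valid else 0.0


def naive_average(shards):
    """Unweighted mean of the per-shard means (one value per rank)."""
    return sum(local_mean(l, m) for l, m in shards) / len(shards)


def token_weighted_average(shards):
    """Sum losses and valid-token counts over all shards, then divide once."""
    total_loss = sum(
        l for losses, mask in shards for l, m in zip(losses, mask) if m
    )
    total_tokens = sum(sum(mask) for _, mask in shards)
    return total_loss / total_tokens


# Hypothetical data: two ranks, each holding 4 token positions of one sequence.
# Rank 1 has only one valid token (the rest is padding or masked out).
shards = [
    ([2.0, 2.0, 2.0, 2.0], [1, 1, 1, 1]),
    ([4.0, 0.0, 0.0, 0.0], [1, 0, 0, 0]),
]

print(naive_average(shards))           # 3.0
print(token_weighted_average(shards))  # 2.4
```

The two results differ whenever ranks hold different numbers of valid tokens, which is the situation where the scaling of the loss matters. When every rank holds the same number of valid tokens, both averages agree.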