Allan Jie
Yes. Removing the `--use_kernel` flag makes it work. Yeah, I'm aware of DeepSpeed FastGen. I'm wondering, how does it support batched inputs? Or should I simply write a for loop over the batch?
I'm really confused: when you run PPO without SFT, for example on NarrativeQA, how does the (quite small) model know that it should generate an answer?