G. Zang

Results 3 issues of G. Zang

### Checklist

- [x] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version....

Thank you for your work! I would like to add a multi-class classification layer after the LLM. Is it possible to directly add it using your code?

Thanks for your excellent work! I trained RL for 20 steps using your cold-start model, but the format reward is always 0.
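One way to narrow down a format reward that stays at 0 is to run the check by hand on a few sampled completions. The sketch below is only illustrative: the issue does not say which output template the reward expects, so the `<think>...</think><answer>...</answer>` pattern and the names `FORMAT_PATTERN`, `format_reward`, and `diagnose` are all assumptions, not the project's actual code.

```python
import re

# Assumed template (hypothetical): the issue does not state the expected format.
FORMAT_PATTERN = re.compile(
    r"\s*<think>.*?</think>\s*<answer>.*?</answer>\s*",
    re.DOTALL,  # let "." span newlines inside the tags
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the whole completion matches the assumed template, else 0.0."""
    return 1.0 if FORMAT_PATTERN.fullmatch(completion) else 0.0


def diagnose(completions):
    """Pair each reward with the first 80 characters of its completion for inspection."""
    return [(format_reward(c), c[:80]) for c in completions]


if __name__ == "__main__":
    samples = [
        "<think>reasoning</think><answer>42</answer>",
        "The answer is 42.",
    ]
    for reward, preview in diagnose(samples):
        print(reward, repr(preview))
```

If every sampled completion scores 0.0 under a check like this, the model is likely not emitting the expected tags at all, which would point at the prompt template or the cold-start checkpoint rather than the RL loop.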