trl
Allow only the ref model to use DeepSpeed stage 3
Allow the ref model and the active model to use different DeepSpeed ZeRO stages. I tested with a 7B model: with the ref model on stage 3 and the active model on stage 2, training is 25% faster than with both models on stage 3. If the ref model and the active model both use stage 2, it causes OOM.
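For illustration, the split described above could be expressed as two separate DeepSpeed config dicts, one per model. This is only a rough sketch: `zero_config` and `split_configs` are hypothetical helper names, not part of TRL or of this PR's diff, and the batch size and `bf16` settings are assumed placeholder values. Only the `zero_optimization.stage` key reflects what the description talks about.

```python
def zero_config(stage: int, micro_batch_size: int = 1) -> dict:
    """Build a minimal DeepSpeed-style config dict for a given ZeRO stage.

    Hypothetical helper for illustration; the values other than the
    ZeRO stage are assumed placeholders.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError(f"unsupported ZeRO stage: {stage}")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": stage},
    }


def split_configs(active_stage: int = 2, ref_stage: int = 3) -> tuple[dict, dict]:
    """Return (active_config, ref_config) with independent ZeRO stages."""
    return zero_config(active_stage), zero_config(ref_stage)


# Setup from the description: active model on stage 2, ref model on stage 3.
active_config, ref_config = split_configs()
print(active_config["zero_optimization"], ref_config["zero_optimization"])
```

The intent is that the frozen ref model, which has no optimizer state, is partitioned with stage 3 to save memory, while the trained model keeps stage 2 to avoid the extra parameter gathering that stage 3 adds.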
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Nice
The code quality check didn't pass: https://github.com/huggingface/trl/actions/runs/9543664472/job/26304378146?pr=1730. Can you double-check? @gromzhu 🙏
Sorry for this mistake, I don't know how to run pre-commit locally properly.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.