[CLI] Extend training support to all trainers

Open lewtun opened this issue 1 year ago • 6 comments

Feature request

The CLI currently supports training models with SFT/DPO/KTO: https://github.com/huggingface/trl/blob/6859e048da601fec181997a324e7b351fc997a33/trl/commands/cli.py#L24

It would be good to extend this support to all trainers so that we have a consistent API and also learn which parts of our scripts need refactoring to support this usage.

This could be tackled in separate PRs to keep things lightweight, and I'll track the trainers here in order of priority (based on Hub usage):

Motivation

It is somewhat annoying that one cannot train a model with every trainer through the CLI, as the CLI is helpful for fast debugging and iteration.

Your contribution

Happy to open PRs, but this could be a good first issue for new contributors!

lewtun avatar Sep 23 '24 10:09 lewtun

#1811 is a duplicate; closing it in favour of this one.

qgallouedec avatar Sep 23 '24 11:09 qgallouedec

Hi, it looks like this would just be a matter of extending the SUPPORTED_COMMANDS constant?

For instance, a script for ORPO already exists: https://github.com/huggingface/trl/blob/main/examples/scripts/orpo.py

I didn't check the others, but I compared the KTO script with the ORPO one.

grumpyp avatar Sep 29 '24 19:09 grumpyp

@grumpyp yes I think this is all one needs - a better solution would be to have an automated way to populate this constant by globbing all the _trainer.py files (e.g. make cli_commands). This way, anytime someone adds a new trainer, we automatically get support for it in the CLI :)

lewtun avatar Sep 30 '24 07:09 lewtun
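
For illustration, the globbing idea suggested above could be sketched roughly as follows. This is not the code from the eventual PR: the helper name `discover_commands`, its `exclude` parameter, and the temporary directory standing in for the trainer package are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def discover_commands(trainer_dir: Path, exclude: frozenset = frozenset()) -> list[str]:
    """Derive CLI command names from `*_trainer.py` file names (hypothetical helper)."""
    suffix = "_trainer"
    names = set()
    for path in trainer_dir.glob("*_trainer.py"):
        # "sft_trainer.py" -> stem "sft_trainer" -> command "sft"
        name = path.stem[: -len(suffix)]
        # Skip an empty name (a file literally called "_trainer.py") and exclusions.
        if name and name not in exclude:
            names.add(name)
    return sorted(names)


# Stand-in directory so the sketch runs without trl installed.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for filename in ("sft_trainer.py", "dpo_trainer.py", "kto_trainer.py",
                     "orpo_trainer.py", "utils.py"):
        (root / filename).touch()
    SUPPORTED_COMMANDS = discover_commands(root)

print(SUPPORTED_COMMANDS)  # ['dpo', 'kto', 'orpo', 'sft']
```

A real version would likely need an exclusion list for files that match the pattern but are not standalone trainers, and a check that a matching example script exists for each discovered name.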

> @grumpyp yes I think this is all one needs - a better solution would be to have an automated way to populate this constant by globbing all the _trainer.py files (e.g. make cli_commands). This way, anytime someone adds a new trainer, we automatically get support for it in the CLI :)

yes definitely! If you want, assign the issue to me please. I'll try to get a PR out today.

grumpyp avatar Sep 30 '24 08:09 grumpyp

assigned! thanks for the offer to help 🤗

lewtun avatar Oct 01 '24 15:10 lewtun

hi @lewtun

I didn't want to manipulate .py files via the Makefile, so I took a slightly different approach.

It now creates the commands dynamically using a utility function which is cached. EDIT: I removed the caching, since the function isn't called anywhere else and the process terminates after running, so the cache would never be reused.

Let me know if that works for you or if it needs changes. Thanks for the opportunity to contribute.

grumpyp avatar Oct 02 '24 10:10 grumpyp
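
The point about caching can be shown with a small sketch. It assumes the cache was in-memory memoization such as `functools.lru_cache`; the thread does not say which mechanism was actually used, and `get_commands` and its return value are hypothetical.

```python
from functools import lru_cache

calls = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def get_commands() -> tuple:
    """Hypothetical stand-in for the command-discovery utility."""
    global calls
    calls += 1
    return ("sft", "dpo", "kto")


# Within one process, the second call is served from the cache.
get_commands()
get_commands()
print(calls)                            # 1
print(get_commands.cache_info().hits)   # 1
```

The cache lives only in the memory of the running process. A CLI that calls the function once and then exits never gets a cache hit, and the next invocation starts with an empty cache, so the decorator adds nothing in that setting.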

Looks like this issue should be marked as closed.

rp440 avatar Mar 17 '25 18:03 rp440

Is this issue solved? If not, I'd like to start contributing to trl by working on it.

sezan92 avatar Aug 17 '25 13:08 sezan92

Thanks! Your contribution is welcome!

qgallouedec avatar Aug 17 '25 18:08 qgallouedec