torchtitan
Any plans to support DPO training?
Out of curiosity, what gaps are you seeing with DPO in torchtune (https://github.com/pytorch/torchtune/blob/main/docs/source/recipes/dpo.rst)?
E.g. multi-node support? Anything else?
Context parallelism is one of the major features missing in torchtune.
Is this tracked in an open issue? I'm working on an implementation that might be helpful.
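For anyone following along, here is a minimal sketch of the DPO objective being discussed (preference pairs scored under the trainable policy and a frozen reference model). The tensor names are placeholders for illustration, not torchtitan or torchtune APIs:

```python
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log-prob of chosen completion under the policy, shape [batch]
    policy_rejected_logps: torch.Tensor,  # log-prob of rejected completion under the policy, shape [batch]
    ref_chosen_logps: torch.Tensor,       # same quantities under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,
) -> torch.Tensor:
    # Policy-vs-reference log-ratios for the chosen and rejected completions.
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # DPO pushes the chosen log-ratio above the rejected one.
    logits = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(logits).mean()
```

The question above is essentially how to compute these per-sample log-probs efficiently at long sequence lengths, which is where context parallelism (sharding the sequence dimension across ranks) would come in.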