MSML
SFT/test_data
Please share the SFT/test_data mentioned in the eval folder so that the evaluation can be replicated with other input models. In addition, please share the SFT models so that I can run my own sanity check on the difference between the SFT and DPO models.
The SFT data is available at https://huggingface.co/datasets/morganstanley/sft-python-q-problems-sft. We'll upload the 32B intermediate checkpoints (SFT and non-reasoning RL) today and do the same for the 7B models later. Thank you for the reminder.