August Moharrami
@lewtun Shouldn't this be a part of Datasets instead? Datasets already has [Interleaves](https://huggingface.co/docs/datasets/v3.0.1/en/process#interleave), which also mixes datasets in a similar way. It's not quite integrable as it is, but may...
> since the logits/dictionary needs to match between the teacher and student model, I do not think it's possible to train with closed models The Anthropic API doesn't output any logits or...
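To illustrate why distillation needs the teacher's logits over the same vocabulary, here is a minimal dependency-free sketch (the toy logits and 4-token vocab are made up): the student minimizes KL(teacher || student) per token, which requires a full teacher probability vector aligned index-for-index with the student's.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a logit vector.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(teacher_logits, student_logits):
    # KL(teacher || student); both vectors must index the SAME vocabulary.
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy shared 4-token vocab: position i must mean the same token in both models.
teacher = [2.0, 1.0, 0.5, -1.0]
student = [1.5, 1.2, 0.3, -0.5]
loss = kl_divergence(teacher, student)
# A closed API that returns only sampled text (no logits) gives nothing to
# put in `teacher_logits`, so this objective cannot be computed.
```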
@lewtun After reading the paper, I noticed that the DPO checkpoints were combined with a different model rather than the reference model used in DPO training. So, I added an...
@coding-famer The callback has an optional parameter called `merge_at_every_checkpoint`, which merges the saved checkpoint either after every step or at the end of each epoch during training.
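To show how that flag gates the merge step, here is a hedged, pure-Python sketch; the class and hook names are illustrative stand-ins, not the callback's actual implementation:

```python
class MergeCallback:
    """Illustrative callback: merge either per saved checkpoint or per epoch."""

    def __init__(self, merge_at_every_checkpoint=False):
        self.merge_at_every_checkpoint = merge_at_every_checkpoint
        self.merged = []  # record of checkpoints we merged, for the sketch

    def _merge(self, checkpoint):
        # Stand-in for the real model-merging step.
        self.merged.append(checkpoint)

    def on_save(self, checkpoint):
        # Fires whenever a checkpoint is saved during training.
        if self.merge_at_every_checkpoint:
            self._merge(checkpoint)

    def on_epoch_end(self, checkpoint):
        # Fires at the end of each epoch.
        if not self.merge_at_every_checkpoint:
            self._merge(checkpoint)

cb = MergeCallback(merge_at_every_checkpoint=True)
cb.on_save("ckpt-100")        # merged, because the flag is True
cb.on_epoch_end("ckpt-epoch1")  # skipped, epoch-end merging is disabled
print(cb.merged)  # ['ckpt-100']
```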
This could go up on r/programmerhorror. Sir, I too struggled to understand the problem. From what I gathered, I have to ask: why would you group similar prompts together? There's...
@qgallouedec It's mostly `dummy-GPT2-correct-vocab`. Do you want to replace those too?
Could you please clarify what's needed here? It seems that none of the example scripts currently take `chat_template` as input.