distilabel
[IMPLEMENTATION] Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
https://arxiv.org/abs/2406.13542