[Multi-Node] Test ``recipes/knowledge_distillation_distributed.py`` on a multi-node setup
Goal
Confirm that ``recipes/knowledge_distillation_distributed.py`` works on a multi-node setup.
Validation
Run the recipe and confirm that the logs are correct. TensorBoard or Weights & Biases charts are preferred as additional evidence.
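For anyone picking this up, here is a rough sketch of how the per-node launch command might be assembled. It assumes `tune run` forwards torchrun-style flags (`--nnodes`, `--nproc_per_node`, `--rdzv_*`) placed before the recipe name, and that the recipe is registered as `knowledge_distillation_distributed`. The helper `build_launch_command`, the rendezvous id, the endpoint, and the config path are all hypothetical placeholders, so adjust them for your cluster.

```python
import shlex


def build_launch_command(nnodes, nproc_per_node, rdzv_endpoint, config,
                         rdzv_id="kd-multinode"):
    """Build the argv list to run on every node (hypothetical helper).

    Assumption: torchrun-style flags go before the recipe name and are
    forwarded by `tune run`; recipe arguments such as --config go after it.
    """
    return [
        "tune", "run",
        "--nnodes", str(nnodes),                  # total number of nodes
        "--nproc_per_node", str(nproc_per_node),  # GPUs per node
        "--rdzv_id", rdzv_id,                     # same id on every node
        "--rdzv_backend", "c10d",
        "--rdzv_endpoint", rdzv_endpoint,         # head node host:port
        "knowledge_distillation_distributed",     # assumed recipe name
        "--config", config,                       # placeholder config path
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your head node address and KD config.
    cmd = build_launch_command(2, 8, "HEAD_NODE_IP:29500", "PATH/TO/KD_CONFIG.yaml")
    print(shlex.join(cmd))
```

The printed command would be run once per node (for example via `srun` under SLURM), with the same rendezvous id and endpoint everywhere.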
Artifacts
If everything works as expected, please submit a PR updating the chart. If you hit any errors or edge cases, feel free to fix them in your PR; if you would rather not, you can instead describe the problems in this GitHub issue for someone else to work on.
I will take this up
Thanks! Tag me on the PR when you're done :)