
[RFC][DOCS] Recipe [DOCS] ([DOC]umentation)

Open SalmanMohammadi opened this issue 1 year ago • 11 comments

What is the purpose of this PR? Is it to

  • [x] add a new feature
  • [ ] fix a bug
  • [x] update tests and/or documentation
  • [ ] other (please add here)

Let's map out a user journey here:

I'm a l33t gamer with a 2xRTX 4090 battlestation. I've just beaten Cyberpunk (on max settings, mind you), and now I'd like to get into fine-tuning LLMs. I'm excited about the latest release of my favourite animal-based LLM, Llama 3.1. What do I want to do? Not 100% sure - fine-tuning an LLM for my Discord? The Llama 3.1 documentation has helped me find ✨torchtune✨! What can I do with it?? How can I quickly discover this? Oh, cool, there's a quick tutorial for fine-tuning Llama3? And it uses LoRA? Great! And there's 🔥🔥🔥 documentation on customizing my own datasets? Amazing!! Now, what else can I do with this LoRA "recipe"? Isn't there a page where I can understand what all these parameters mean, at once? ... Well, what other recipes are there? What else can I do with ✨torchtune✨? ...

Right now, we don't have a clear way to communicate to our users which recipes we support, and how to quickly configure them. Documentation for recipes is:

  • Hidden inside recipe files. Recipe-specific features are listed amongst features common to all recipes. This makes the documentation a bit difficult to parse.
  • Hidden inside our config files, which are also replicated across our configs. Our config files contain some of the more crucial details when using recipes; which commands to run, which levers to pull.

I want to maximise the surface area of the features we expose in torchtune. Our design philosophy keeps things flat and modular, so users can swap out components, models, and datasets freely. I wish for the ML PhD and the l33t gamer to be able to discover what we offer, and how to use it, with equal ease.

My contribution addresses this in the following ways:

  • I propose a glossary of features which are common across all of our recipes, and also common amongst recipes with specialised fine-tuning features such as PEFT or FSDP(2).
  • I propose a simple recipe documentation template. This template uses the commands we'd usually place inside the config files to allow users to quickly get started with the recipe. This template then includes copy-and-paste text for lists of features which we commonly expose in recipes - these link to relevant sections in the glossary above.
  • These recipes are then simply indexed in the recipe overview.
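As a rough illustration of the template idea above, here is a minimal Python sketch of how a recipe page could be assembled from a run command and a list of features that link back to the glossary. Everything in it is hypothetical and for illustration only: `RecipeDoc`, `GLOSSARY_ANCHORS`, `render`, `render_index`, and the anchor paths are not part of torchtune or of this PR, and the command string merely stands in for the kind of command that currently lives in the config files.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a feature name to its glossary section.
# The anchor paths are made up for this sketch.
GLOSSARY_ANCHORS = {
    "LoRA": "glossary.html#lora",
    "activation checkpointing": "glossary.html#activation-checkpointing",
}


@dataclass
class RecipeDoc:
    """Hypothetical container for the pieces of one recipe page."""

    name: str
    run_command: str
    features: list[str] = field(default_factory=list)


def render(doc: RecipeDoc) -> str:
    """Render a recipe page: title, quick-start command, feature list."""
    lines = [
        doc.name,
        "=" * len(doc.name),
        "",
        "Quick start::",
        "",
        f"    {doc.run_command}",
        "",
        "Supported features:",
        "",
    ]
    for feature in doc.features:
        anchor = GLOSSARY_ANCHORS.get(feature)
        # Link to the glossary when an entry exists; otherwise list the
        # feature as plain text.
        lines.append(f"* {feature} (see {anchor})" if anchor else f"* {feature}")
    return "\n".join(lines)


def render_index(docs: list[RecipeDoc]) -> str:
    """Render the recipe overview as a flat list of recipe names."""
    return "\n".join(f"* {doc.name}" for doc in docs)


if __name__ == "__main__":
    lora = RecipeDoc(
        name="LoRA single device",
        # Example command only; assumed here to mirror what a config file lists.
        run_command="tune run lora_finetune_single_device "
        "--config llama3_1/8B_lora_single_device",
        features=["LoRA", "activation checkpointing"],
    )
    print(render(lora))
    print()
    print(render_index([lora]))
```

The point of the sketch is the shape, not the tooling: each page is a command plus links into one shared glossary, so feature descriptions are written once instead of being repeated in every recipe and config file.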

I provide two examples in this PR: documentation for LoRA single device and for QAT distributed. I'd like to put an issue up for documenting additional recipes so other people can help out here; it'll be a good first issue for many contributors.

There are also a couple of things missing from the memory glossary: FSDP/FSDP2 (and maybe something else?). I'll put issues up for these too.

Test plan

⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⣤⣤⣤⣤⣴⣤⣤⣄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⣀⣴⣾⠿⠛⠋⠉⠁⠀⠀⠀⠈⠙⠻⢷⣦⡀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⣤⣾⡿⠋⠁⠀⣠⣶⣿⡿⢿⣷⣦⡀⠀⠀⠀⠙⠿⣦⣀⠀⠀⠀⠀ ⠀⠀⢀⣴⣿⡿⠋⠀⠀⢀⣼⣿⣿⣿⣶⣿⣾⣽⣿⡆⠀⠀⠀⠀⢻⣿⣷⣶⣄⠀ ⠀⣴⣿⣿⠋⠀⠀⠀⠀⠸⣿⣿⣿⣿⣯⣿⣿⣿⣿⣿⠀⠀⠀⠐⡄⡌⢻⣿⣿⡷ ⢸⣿⣿⠃⢂⡋⠄⠀⠀⠀⢿⣿⣿⣿⣿⣿⣯⣿⣿⠏⠀⠀⠀⠀⢦⣷⣿⠿⠛⠁ ⠀⠙⠿⢾⣤⡈⠙⠂⢤⢀⠀⠙⠿⢿⣿⣿⡿⠟⠁⠀⣀⣀⣤⣶⠟⠋⠁⠀⠀⠀ ⠀⠀⠀⠀⠈⠙⠿⣾⣠⣆⣅⣀⣠⣄⣤⣴⣶⣾⣽⢿⠿⠟⠋⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠙⠛⠛⠙⠋⠉⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀

SalmanMohammadi avatar Jul 26 '24 13:07 SalmanMohammadi

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1230

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit 3d06179198628b077b409bf3ae6f63141a66ba34 with merge base 3653c4aaeb2504374bc701c56f3121584727a144: :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Jul 26 '24 13:07 pytorch-bot[bot]

So sorry I fat-thumbed the request reviewer button.

SalmanMohammadi avatar Aug 02 '24 17:08 SalmanMohammadi

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 70.77%. Comparing base (3653c4a) to head (dc3bcbf). Report is 24 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1230      +/-   ##
==========================================
- Coverage   72.11%   70.77%   -1.34%     
==========================================
  Files         233      258      +25     
  Lines       10603    11904    +1301     
==========================================
+ Hits         7646     8425     +779     
- Misses       2957     3479     +522     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Aug 02 '24 17:08 codecov-commenter

Thank you so much @RdoubleA. Sorry about all the typos. I need to get a spell checker for my editor.

SalmanMohammadi avatar Aug 03 '24 14:08 SalmanMohammadi

I would even go further and maybe list a table of all the methods and whether each one boosts memory/speed, with some recommendations ("If you want more speed, do X; if you want reduced memory, do Y").

Seems like https://github.com/pytorch/torchtune/issues/1252 would be well placed to address this?

SalmanMohammadi avatar Aug 05 '24 10:08 SalmanMohammadi

Seems like https://github.com/pytorch/torchtune/issues/1252 would be well placed to address this?

Yes, if this is something @felipemello1 is planning to do. But we can leave it as a follow-up.

RdoubleA avatar Aug 05 '24 14:08 RdoubleA

@RdoubleA think I've addressed everything. Thanks again for such a thorough review

SalmanMohammadi avatar Aug 05 '24 17:08 SalmanMohammadi

Okay, okay, I'll add a table. I saw this https://pytorch.org/docs/stable/torch.compiler_fine_grain_apis.html and I liked it.

SalmanMohammadi avatar Aug 13 '24 22:08 SalmanMohammadi

Think I've addressed all the comments. Added a table at the top of the tutorial. Have DPO/PPO recipe docs in the oven too.

SalmanMohammadi avatar Aug 19 '24 13:08 SalmanMohammadi

thanks for making the changes. I think that they addressed most if not all of my concerns.

felipemello1 avatar Aug 19 '24 15:08 felipemello1

thanks for making the changes. I think that they addressed most if not all of my concerns.

tysm for reviewing : )

SalmanMohammadi avatar Aug 19 '24 15:08 SalmanMohammadi