torchtune icon indicating copy to clipboard operation
torchtune copied to clipboard

Update models docs page

Open joecummings opened this issue 9 months ago • 5 comments

Context: Our models page was not informative and things like tokenizers were missing.

Testing: Eyes

joecummings avatar May 08 '24 21:05 joecummings

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/954

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit 165df0061c41d5998c99747cdb2975c2a98603fd with merge base 1e56b807a4124bcd46c7e93f82897eac22e758ee (image): :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar May 08 '24 21:05 pytorch-bot[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 27.05%. Comparing base (a978956) to head (57c991c). Report is 32 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #954       +/-   ##
===========================================
- Coverage   66.39%   27.05%   -39.35%     
===========================================
  Files         155      174       +19     
  Lines        6484     7363      +879     
===========================================
- Hits         4305     1992     -2313     
- Misses       2179     5371     +3192     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar May 08 '24 22:05 codecov-commenter

Overall this is a huge improvement. I'm not married to the blurbs - we can link to model cards/papers which sufficiently explain the models IMO.

I'd love to see the following things (some of which are here already):

  • A top-level note to let users know they can swap all the models they see here across all of our recipes. How much detail we provide on doing so vs. linking to other docs I'm open to discussing.
  • Memory usage of different models - I think this is just a nice QOL add, even just to say the memory usage default for bf16.
  • The explainer on generic builders has been sorely missing. I wouldn't mind a little example showing when to use one (e.g. for tinyllama or something)
  • Would it be inappropriate to immortalize a contributor's contribution if they contributed a specific model, by linking to their contribution in the model section?
  • I think the tokenizer sections could use a little love, particularly since we've fleshed these out a bit. For example, worth specifying llama tokenizer is a sentencepiece tokenizer, and not the fast BPE tokenizer?

More thoughts maybe to come. Thanks for reviving this.

SalmanMohammadi avatar Aug 06 '24 17:08 SalmanMohammadi

Also, maybe this is obvious to everyone, but it might be a good opportunity to add a little noobsplainer™ about what's up with all our lora model and component builders. We can defer to the tutorial for in-depth details here. For example, lora_llama2_7b isn't actually another Llama2 model you can download or find the weights for, it's just the LoRA version of the same model, and here's how to specify it, et cetera. Maybe my section in the memory glossary is a good link here?

SalmanMohammadi avatar Aug 06 '24 17:08 SalmanMohammadi

Overall I think this is a huge improvement, we just need to ensure the updates are applied consistently across all the different models we support. A few responses to various comments I've seen so far:

(1) I see a few comments about noise/information ratio, especially around which APIs we do and don't document. Imo we shouldn't compromise on documenting our public APIs, even if it means there is extra stuff on our docs pages. Instead, we should structure our docs such that the important stuff is easy to find, and the less pressing stuff is still available. I think separating models out into separate pages will help, as will structuring each individual page from most essential info at the top to least essential info at the bottom.

(2) Re stuff like commands to fine-tune a given model or a given model's memory, I don't think we should put it here. Instead, we should provide this information elsewhere and link to it. These are API docs and so first and foremost their job is to provide everything a user needs to know about interacting with said APIs. So the existing example usage is the right idea imo -- it shows how to use the provided APIs in Python in a minimal and self-contained way. (Though I would think about how we can make the example usage even a bit clearer, the checkpointer portion is unfortunately a bit of a black box in its current format.)

Would it be inappropriate to immortalize a contributor's contribution if they contributed a specific model, by linking to their contribution in the model section?

This I wouldn't do personally. Unlike tutorials where we can reference a contributor in a single line at the top, our API docs shouldn't have any extra fluff

ebsmothers avatar Aug 08 '24 03:08 ebsmothers