llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a...
Initial fix for https://github.com/meta-llama/llama-recipes/issues/633
Hello, Thank you for providing these valuable recipes. I appreciate your work. I'm interested in **further pre-training the Llama3.1-8B-base model rather than using the instruct version**. To ensure I prepare...
### System Info PyTorch: 2.3 CUDA: 12.1 ### Information - [ ] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug I got...
### 🚀 The feature, motivation and pitch Hey folks, big fan of your work, especially all the details provided around evaluations! I am trying to reproduce results for [MultiPL-E](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/eval_details.md#multipl-e-humaneval-and-multipl-e-mbpp) without...
### 🚀 The feature, motivation and pitch It's time to design a logo ### Alternatives more ### Additional context _No response_
Improve model checkpoint saving logic to always save model checkpoint when validation is not run # What does this PR do? This PR enhances the model checkpoint saving logic to...
# What does this PR do? availble -> available ## Feature/Issue validation/testing Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so...
### System Info N/A ### Information - [X] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug The llama.com website states: > The...
### 🚀 The feature, motivation and pitch Is there any plan to add [FSDP2](https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md) for training? ### Alternatives _No response_ ### Additional context _No response_