torchtune
PyTorch native post-training library
Summary: add a flag that performs validation when `load_checkpoint` is passed an unexpected `meta_to_tune` flag, i.e. if this flag is true but the checkpoint is not in meta format, or if this flag...
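A minimal sketch of what such a validation could look like. The helper names `is_meta_format` and `validate_meta_flag` and the filename heuristic are hypothetical, not torchtune API:

```python
def is_meta_format(checkpoint_files) -> bool:
    """Heuristic sketch: Meta-format checkpoints typically ship as
    consolidated ``.pth`` shards (assumption for illustration only)."""
    return any(
        str(f).endswith(".pth") and "consolidated" in str(f)
        for f in checkpoint_files
    )


def validate_meta_flag(checkpoint_files, meta_flag: bool) -> None:
    """Fail fast when the flag disagrees with the checkpoint format."""
    looks_meta = is_meta_format(checkpoint_files)
    if meta_flag and not looks_meta:
        raise ValueError("flag is set, but checkpoint is not in meta format")
    if not meta_flag and looks_meta:
        raise ValueError("checkpoint looks like meta format, but flag is not set")
```

The idea is simply to raise at load time rather than failing later with a confusing key-mismatch error.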
Currently, we only support converting text and images from the OpenAI format to the torchtune Messages format. We should incorporate tool calling, as it is supported by the OpenAI format and...
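A rough sketch of how OpenAI-format tool-call messages might map onto a simple role/content structure. The `to_message` helper and its output shape are illustrative only and do not reflect torchtune's actual Messages API:

```python
import json


def to_message(openai_msg: dict) -> dict:
    """Map one OpenAI-format chat message to a plain role/content dict.

    Assistant tool calls are serialized into the content; tool results are
    tagged with an 'ipython' role, mirroring how some chat templates label
    tool output. Purely illustrative, not torchtune's converter.
    """
    role = openai_msg["role"]
    if role == "assistant" and openai_msg.get("tool_calls"):
        calls = [
            {"name": c["function"]["name"], "arguments": c["function"]["arguments"]}
            for c in openai_msg["tool_calls"]
        ]
        return {"role": "assistant", "content": json.dumps(calls)}
    if role == "tool":
        return {"role": "ipython", "content": openai_msg["content"]}
    return {"role": role, "content": openai_msg.get("content", "")}
```

The open question in the issue is exactly this mapping: which torchtune role tool results should land on, and how call arguments should be rendered into message content.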
Torchtune provides several configs for fine-tuning LLMs (such as [llama3_2/3B_full.yaml](https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3_2/3B_full.yaml)), and they often use the Alpaca dataset. Could you suggest some evaluation benchmarks for checking whether an LLM has been trained properly with Alpaca (or...
Hi, I'd like to express my gratitude for torchtune as it provides me with a high level of abstraction when trying to experiment with various post-training strategies. However, in my...
### Goal

Add a recipe entitled `fft_knowledge_distillation_distributed.py` that largely mirrors [knowledge_distillation_distributed.py](./recipes/knowledge_distillation_distributed.py) but uses full-weight finetuning instead of the LoRA method of weight updating.

### Artifacts

* One recipe...
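The distillation objective itself would be unchanged; only the set of trainable weights differs. A minimal sketch of the standard forward-KL distillation loss on logits, written in plain PyTorch as an assumption about the shape of the objective rather than a copy of torchtune's loss class:

```python
import torch
import torch.nn.functional as F


def forward_kl_loss(
    student_logits: torch.Tensor,
    teacher_logits: torch.Tensor,
    temperature: float = 1.0,
) -> torch.Tensor:
    """Forward KL(teacher || student) over the vocabulary dimension.

    In a full-finetune KD recipe this loss would backpropagate into all
    student weights, not just LoRA adapters.
    """
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # batchmean averages over the leading dim; t^2 rescaling is the usual
    # correction when distilling with a softened temperature
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t**2)
```

In practice this term is typically mixed with the ordinary cross-entropy loss via a weighting coefficient, as the existing KD recipe does.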
Currently, when resuming from a previous run that utilizes a learning rate scheduler, we do NOT load a state dict from the scheduler. **But wait, does that mean our code...
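For reference, restoring a scheduler correctly means checkpointing and reloading its `state_dict()` alongside the optimizer's. A minimal sketch with a plain PyTorch `StepLR` (not torchtune's recipe code):

```python
import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

# Train for a while, then checkpoint optimizer AND scheduler state
for _ in range(15):
    optimizer.step()
    scheduler.step()
ckpt = {"opt": optimizer.state_dict(), "sched": scheduler.state_dict()}

# On resume: a fresh scheduler would restart the LR schedule from step 0
# unless its state dict is loaded back
optimizer2 = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer2, step_size=10, gamma=0.5)
optimizer2.load_state_dict(ckpt["opt"])
scheduler2.load_state_dict(ckpt["sched"])
# scheduler2 now resumes at step 15 with the already-decayed LR
```

Skipping the `load_state_dict` call silently resets the decay schedule, which is exactly the resume bug the issue is asking about.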
Reference: https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct
Today, users have to do manual conversions between `.pth` and `.safetensors` formats before/after fine-tuning with torchtune. **Example 1: torchtitan -> torchtune -> HF transformers.** torchtitan outputs `.dcp`, which can be...
We previously didn't do this for **_reasons_**. Not sure what other folks do, but my typical flow for launching a finetune is currently: 1) open the config 2) copy-paste the...
Initial implementation of context parallelism in torchtune.

### Initial test

```
tune run --nproc_per_node 8 full_finetune_distributed --config llama3/8B_full \
  context_parallel_dim=4 metric_logger=torchtune.training.metric_logging.WandBLogger \
  metric_logger.project=context-parallel metric_logger.name=llama3-8b-cp4-dp2
```

Also confirmed that we can run...