llm-foundry
LLM training code for Databricks foundation models
Adds an option for soft-capping the attention and lm_head logits, to allow Gemma-like models. The config names are the same as the Hugging Face names used here: https://github.com/huggingface/transformers/blob/96a074fa7e2c04b904f72d9e827398d4c5f90f25/src/transformers/models/gemma2/modeling_gemma2.py#L371
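For context, Gemma 2 applies a tanh-based soft cap to both the pre-softmax attention scores and the final logits. A minimal sketch of that operation, assuming the Hugging Face config names `attn_logit_softcapping` and `final_logit_softcapping` (the tensor shapes and cap values below are illustrative):

```python
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    """Gemma-2-style soft-capping: squashes logits smoothly into (-cap, cap)."""
    return cap * torch.tanh(logits / cap)

# Hypothetical usage; cap values are illustrative.
attn_scores = torch.randn(2, 8, 16, 16)          # (batch, heads, q_len, k_len)
attn_scores = soft_cap(attn_scores, cap=50.0)    # attn_logit_softcapping
lm_logits = torch.randn(2, 16, 32000)            # (batch, seq_len, vocab)
lm_logits = soft_cap(lm_logits, cap=30.0)        # final_logit_softcapping
```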
This PR upgrades the Habana support to llm-foundry v0.10.0
Adds a feature similar to the [library usage of lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md), implemented via a simple registry add-on.
Updates the requirements on [datasets](https://github.com/huggingface/datasets) to permit the latest version. Release notes, sourced from datasets's releases, 2.20.0: Important: Remove default `trust_remote_code=True` by @lhoestq in huggingface/datasets#6954; datasets with a python loading...
This is a new callback to simplify logging environment metadata for reproducibility purposes (a minimal sketch follows the list):
- Git commits for packages under `workspace_dir`, useful for mcli integrations
- Package versions for core dependencies,...
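A rough sketch of what such a callback could look like, assuming Composer's `Callback`/`Logger` interfaces; the class name, package list, and `workspace_dir` handling here are illustrative, not the PR's actual implementation:

```python
import subprocess
from importlib import metadata

from composer.core import Callback, State
from composer.loggers import Logger

class EnvironmentMetadataCallback(Callback):
    """Logs a git commit and core package versions once at INIT for reproducibility."""

    def __init__(self, workspace_dir: str = '.', packages=('torch', 'composer', 'llm-foundry')):
        self.workspace_dir = workspace_dir
        self.packages = packages

    def init(self, state: State, logger: Logger) -> None:
        env = {}
        # Git commit of the workspace checkout, if it is a git repo.
        try:
            env['git_commit'] = subprocess.check_output(
                ['git', '-C', self.workspace_dir, 'rev-parse', 'HEAD'], text=True).strip()
        except (OSError, subprocess.CalledProcessError):
            env['git_commit'] = 'unknown'
        # Installed versions of the core dependencies.
        for pkg in self.packages:
            try:
                env[f'{pkg}_version'] = metadata.version(pkg)
            except metadata.PackageNotFoundError:
                env[f'{pkg}_version'] = 'not installed'
        logger.log_hyperparameters({'environment_metadata': env})
```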
Hi, I was trying to run multi-node training on Slurm nodes, but I have no idea how to configure the `composer` arguments and commands. Is there an example script to run...
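I'm not aware of a packaged Slurm example, but one common pattern is to launch one task per GPU with `srun` and translate Slurm's environment into the torch-distributed-style variables Composer reads before invoking the training script. A rough sketch only; the exact variable mapping is an assumption about your cluster setup:

```python
import os
import subprocess

# Map Slurm's per-task environment onto the distributed variables Composer reads.
os.environ['RANK'] = os.environ['SLURM_PROCID']
os.environ['WORLD_SIZE'] = os.environ['SLURM_NTASKS']
os.environ['LOCAL_RANK'] = os.environ['SLURM_LOCALID']
os.environ['LOCAL_WORLD_SIZE'] = os.environ['SLURM_NTASKS_PER_NODE']  # set via --ntasks-per-node
os.environ['NODE_RANK'] = os.environ['SLURM_NODEID']
# Use the first host in the allocation as the rendezvous master.
os.environ['MASTER_ADDR'] = subprocess.check_output(
    ['scontrol', 'show', 'hostnames', os.environ['SLURM_NODELIST']],
    text=True).splitlines()[0]
os.environ.setdefault('MASTER_PORT', '29500')

# After this, run the normal single-process entrypoint in each task,
# e.g. scripts/train/train.py with your usual YAML config.
```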
## 🚀 Feature Request
The current StreamingTextDataset truncates the text/tokens to max_seq_len directly and throws away all leftover text/tokens. Would it be possible to support truncating the text/tokens to...
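For illustration, the requested behavior amounts to splitting an over-long tokenized document into consecutive windows instead of keeping only the first one. A minimal sketch, not existing llm-foundry code (`chunk_tokens` is a hypothetical helper):

```python
from typing import Iterable, List

def chunk_tokens(token_ids: List[int], max_seq_len: int,
                 drop_last_incomplete: bool = False) -> Iterable[List[int]]:
    """Split a long tokenized document into consecutive max_seq_len windows
    instead of keeping only the first window and discarding the rest."""
    for start in range(0, len(token_ids), max_seq_len):
        chunk = token_ids[start:start + max_seq_len]
        if drop_last_incomplete and len(chunk) < max_seq_len:
            break
        yield chunk

# A 10-token document with max_seq_len=4 yields [0..3], [4..7], and [8, 9].
print(list(chunk_tokens(list(range(10)), max_seq_len=4)))
```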
Creates two exceptions:
- `DatasetMissingFileError` --> Tells the user that a dataset file could not be found during a failure in the finetuning dataloader where `split` / `path` were not...
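As an illustration of the intent, a minimal sketch of the first exception; the attributes and message here are assumptions, not necessarily the PR's implementation:

```python
class DatasetMissingFileError(FileNotFoundError):
    """Raised when a dataset file referenced by the finetuning dataloader cannot be found."""

    def __init__(self, filename: str) -> None:
        self.filename = filename
        super().__init__(
            f'Dataset file {filename} was not found. Check that the `path` and `split` '
            'in your dataloader config point at an existing file.')

# Surfaces a clearer, actionable message instead of a bare FileNotFoundError.
try:
    raise DatasetMissingFileError('train/shard-00000.jsonl')
except DatasetMissingFileError as err:
    print(err)
```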
Hello, I'm currently training LLaMA Pro. Initially, I expanded the model from 32 layers to 40 layers and proceeded to train only the 8 newly added layers (every fifth layer)....
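To make the setup concrete, here is a minimal sketch of freezing everything except the inserted blocks, assuming a Hugging Face `LlamaForCausalLM` whose 40 layers contain the 8 new blocks at every fifth index; the checkpoint path and indices are placeholders:

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained('path/to/expanded-40-layer-llama')  # placeholder path

# Indices of the 8 inserted blocks, assuming one new block after every 4 original ones.
new_layer_indices = set(range(4, 40, 5))

for param in model.parameters():
    param.requires_grad = False
for idx, layer in enumerate(model.model.layers):
    if idx in new_layer_indices:
        for param in layer.parameters():
            param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'Trainable parameters: {trainable:,}')
```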
Hi! Do you support the fill-in-the-middle (FIM) technique in the pretraining pipelines? If so, do you have any documentation about it? Thanks!
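For reference, fill-in-the-middle is usually implemented as a data transform that reorders each document into prefix/suffix/middle segments separated by sentinel tokens. A minimal sketch; the sentinel strings follow the StarCoder convention and are an assumption, not llm-foundry tokens:

```python
import random
from typing import Optional

# Sentinel strings in the StarCoder style; these are an assumption, not llm-foundry tokens.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = '<fim_prefix>', '<fim_suffix>', '<fim_middle>'

def apply_fim(text: str, fim_rate: float = 0.5, rng: Optional[random.Random] = None) -> str:
    """With probability fim_rate, reorder a document into prefix/suffix/middle
    form so the model learns to infill the middle span."""
    rng = rng or random.Random()
    if rng.random() >= fim_rate or len(text) < 3:
        return text  # keep the document in plain left-to-right order
    i, j = sorted(rng.sample(range(1, len(text)), 2))
    prefix, middle, suffix = text[:i], text[i:j], text[j:]
    return f'{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}'

print(apply_fim('def add(a, b):\n    return a + b\n', fim_rate=1.0))
```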