
LLM training code for Databricks foundation models

Results: 267 llm-foundry issues, sorted by recently updated

Adds an option for a softcap on the attention and `lm_head` logits, to allow Gemma-like models. The config names are the same as the Hugging Face names here: https://github.com/huggingface/transformers/blob/96a074fa7e2c04b904f72d9e827398d4c5f90f25/src/transformers/models/gemma2/modeling_gemma2.py#L371
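The Gemma 2 softcap squashes logits with a scaled tanh so they stay within a fixed range while remaining roughly linear near zero. A minimal sketch of the operation, using the Hugging Face Gemma 2 config values (`attn_logit_softcapping=50.0`, `final_logit_softcapping=30.0`) rather than the exact config keys this PR introduces:

```python
import torch

def soft_cap(scores: torch.Tensor, cap: float) -> torch.Tensor:
    """Squash scores into (-cap, cap) while keeping them roughly linear near zero."""
    return cap * torch.tanh(scores / cap)

# Dummy tensors just to show where the cap is applied.
attn_scores = torch.randn(2, 8, 16, 16)        # (batch, heads, q_len, k_len)
capped_attn = soft_cap(attn_scores, cap=50.0)  # attn_logit_softcapping in HF Gemma 2

logits = torch.randn(2, 16, 32000)             # (batch, seq_len, vocab)
capped_logits = soft_cap(logits, cap=30.0)     # final_logit_softcapping in HF Gemma 2
```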

This PR upgrades the Habana support to llm-foundry v0.10.0

Adds a feature similar to the [library usage of lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md), implemented via a simple registry add-on.

Updates the requirements on [datasets](https://github.com/huggingface/datasets) to permit the latest version. Release notes (sourced from datasets's releases), 2.20.0: Important: Remove default `trust_remote_code=True` by @lhoestq in huggingface/datasets#6954; datasets with a Python loading...
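For context, the headline change in datasets 2.20.0 is that `trust_remote_code` no longer defaults to `True`, so datasets that rely on a Python loading script must opt in explicitly (the repo name below is hypothetical):

```python
from datasets import load_dataset

# With datasets >= 2.20.0, script-based datasets fail unless this flag is passed.
ds = load_dataset(
    'some_org/script_based_dataset',  # hypothetical script-based dataset
    trust_remote_code=True,
)
```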

dependencies

This is a new callback to simplify logging environment metadata for reproducibility purposes:
- Git commits for packages under `workspace_dir`, useful for mcli integrations
- Package versions for core dependencies, ...
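A rough sketch of what such a callback could look like using Composer's `Callback` API; this is an illustration under assumptions (the class name, logged keys, and package list are made up), not the PR's implementation:

```python
import subprocess
from importlib.metadata import version

from composer import Callback, Logger, State


class EnvironmentMetadataLogger(Callback):
    """Log reproducibility metadata (git commit, core package versions) at fit start."""

    def fit_start(self, state: State, logger: Logger) -> None:
        # Current git commit of the working directory (empty string if not a git repo).
        commit = subprocess.run(
            ['git', 'rev-parse', 'HEAD'],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
        logger.log_hyperparameters({
            'env/git_commit': commit,
            'env/composer_version': version('mosaicml'),
            'env/torch_version': version('torch'),
        })
```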

Hi, I was trying to run multi-node training on SLURM nodes, but I have no idea how to configure the `composer` arguments and commands. Is there any example script to run...

enhancement

## 🚀 Feature Request
The current StreamingTextDataset truncates the text/tokens to `max_seq_len` directly and throws out all remaining text/tokens. Would it be possible to support truncating the text/tokens to...
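A sketch of the requested behavior, assuming the fix is to split long documents into consecutive chunks instead of dropping everything past `max_seq_len` (this is not the current StreamingTextDataset code):

```python
from typing import List

def chunk_tokens(token_ids: List[int], max_seq_len: int) -> List[List[int]]:
    """Split token_ids into consecutive chunks of at most max_seq_len tokens."""
    return [token_ids[i:i + max_seq_len] for i in range(0, len(token_ids), max_seq_len)]

# A 10-token document with max_seq_len=4 keeps all tokens across three chunks.
print(chunk_tokens(list(range(10)), max_seq_len=4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```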

enhancement

Creates two exceptions:
- `DatasetMissingFileError` --> Tells the user that a dataset file could not be found during a failure in the finetuning dataloader where `split` / `path` were not...
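A minimal sketch of what the first exception might look like (the constructor arguments and message are assumptions, not the PR's actual code):

```python
class DatasetMissingFileError(FileNotFoundError):
    """Raised when a finetuning dataset file cannot be found for a given split/path."""

    def __init__(self, dataset_name: str, split: str, filename: str) -> None:
        self.dataset_name = dataset_name
        self.split = split
        self.filename = filename
        super().__init__(
            f'No file named {filename} found for split {split} of dataset {dataset_name}. '
            'Check that the dataset path and split name are spelled correctly.',
        )
```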

Hello, I'm currently training LLaMA PRO. Initially, I expanded the model from 32 layers to 40 layers and proceeded to train only the newly added 8 layers (every fifth layer)....
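A sketch of how one might freeze everything except the newly inserted blocks with Hugging Face Transformers, assuming the new blocks sit at every fifth position (indices 4, 9, ..., 39) in the expanded 40-layer model; the checkpoint path is hypothetical:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('path/to/expanded-llama-40-layers')  # hypothetical

# Freeze every parameter by default.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the inserted blocks (every fifth decoder layer).
new_layer_indices = {i for i in range(len(model.model.layers)) if (i + 1) % 5 == 0}
for idx, layer in enumerate(model.model.layers):
    if idx in new_layer_indices:
        for param in layer.parameters():
            param.requires_grad = True
```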

question

Hi! Do you support the fill-in-the-middle technique in the pretraining pipelines? If so, do you have any documentation about this? Thanks!
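For reference, the standard fill-in-the-middle transform rearranges each document into prefix/suffix/middle order with sentinel tokens (PSM format). A sketch under assumptions; the sentinel strings are illustrative and not something llm-foundry's pretraining pipeline is confirmed to provide:

```python
import random

FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = '<|fim_prefix|>', '<|fim_suffix|>', '<|fim_middle|>'

def apply_fim(text: str, rng: random.Random) -> str:
    """Split a document at two random points and rearrange it for FIM training."""
    a, b = sorted(rng.sample(range(len(text)), 2))
    prefix, middle, suffix = text[:a], text[a:b], text[b:]
    return f'{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}'

rng = random.Random(0)
print(apply_fim('def add(x, y):\n    return x + y\n', rng))
```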

enhancement