llm-foundry

LLM training code for Databricks foundation models

267 llm-foundry issues, sorted by recently updated

## 🚀 Feature Request Support a DPO (Direct Preference Optimization) loss and data loader. ## Motivation Many recent open LLMs have achieved promising results using DPO instead of RL-style tuning... (see the loss sketch after this entry)

enhancement
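For reference, DPO reduces preference tuning to a classification-style loss over paired (chosen, rejected) responses, with no reward model or PPO loop. Below is a minimal sketch of that loss in PyTorch; the function name and arguments are illustrative, not llm-foundry API, and assume per-sequence log-probabilities have already been summed over tokens for both the policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    Each argument is a tensor of per-sequence log-probabilities summed
    over tokens; `beta` scales the implicit KL penalty to the reference.
    """
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # Negative log-sigmoid of the margin: minimized when the policy
    # prefers chosen over rejected more strongly than the reference.
    return -F.logsigmoid(logits).mean()
```

A matching data loader would only need to yield chosen/rejected token sequences per prompt so these four log-probability tensors can be computed in two forward passes.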

Hi, Llama 3 is trained like this: > We trained the models on sequences of 8,192 tokens, using a mask to ensure self-attention does not cross document boundaries. I see you... (see the mask sketch after this entry)

enhancement
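One common way to implement the masking Llama 3 describes is a block-diagonal causal mask derived from per-token document IDs in a packed sequence. The sketch below is purely illustrative; the helper name and shapes are assumptions, not existing llm-foundry code.

```python
import torch

def document_causal_mask(doc_ids: torch.Tensor) -> torch.Tensor:
    """Boolean (seq_len, seq_len) attention mask for packed sequences.

    `doc_ids` assigns each token the index of its source document.
    Token i may attend to token j only when j <= i (causal) and both
    tokens belong to the same document, so self-attention never
    crosses document boundaries.
    """
    n = doc_ids.numel()
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)
    causal = torch.tril(torch.ones(n, n, dtype=torch.bool))
    return same_doc & causal

# Three short documents packed into one 8-token training sequence.
ids = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2])
mask = document_causal_mask(ids)  # True where attention is allowed
```

In practice this mask (or equivalent per-document sequence lengths) would be passed to the attention kernel rather than materialized at full size for long contexts.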

Added [FreebaseQA](https://aclanthology.org/N19-1028/) which is used in several recent papers [1](https://link.springer.com/chapter/10.1007/978-3-031-44216-2_28) [2](https://arxiv.org/abs/2403.09712) [3](https://openreview.net/forum?id=B9klVS7Ddk). Tested on [Vicuna-7B](https://huggingface.co/lmsys/vicuna-7b-v1.3) and obtained similar results to [3](https://openreview.net/forum?id=B9klVS7Ddk). Feel free to cherry-pick if we only need the...

This issue is to track the addition of state space models and Mamba layer support to the llm-foundry project. These features are essential for enhancing the capabilities of the project... (a toy SSM recurrence is sketched below)
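For context, a plain (non-selective) state space layer is a learned linear recurrence unrolled over the sequence; Mamba makes the recurrence parameters input-dependent and uses a hardware-aware parallel scan. A toy, purely illustrative sketch of the basic recurrence:

```python
import torch

def ssm_scan(x, A, B, C):
    """Minimal linear state space layer, applied step by step:
    h_t = A @ h_{t-1} + B @ x_t,  y_t = C @ h_t.

    x: (seq_len, d_in); A: (d_state, d_state); B: (d_state, d_in);
    C: (d_out, d_state). Mamba replaces fixed A, B, C with selective,
    input-dependent parameters and a much faster scan implementation.
    """
    h = torch.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t
        ys.append(C @ h)
    return torch.stack(ys)

# Toy example: 10 steps, 4-dim input, 8-dim state, 4-dim output.
x = torch.randn(10, 4)
y = ssm_scan(x, torch.eye(8) * 0.9, torch.randn(8, 4), torch.randn(4, 8))
```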

https://dbc-04ac0685-8857.staging.cloud.databricks.com/ml/experiments/974778159721961/runs/5e260e958bef4d3d82b06bce55224ebb?o=3360802220363900

- Added `convert_mosaicbert_to_hf.py` inside the `inference` folder.
- `inference/convert_composer_to_hf.py` was only able to convert MPT and other CausalLMs to HF; adapted that code to work on mosaic_bert.
- The python script...
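For readers unfamiliar with this kind of conversion, the core idea is to unwrap the Composer checkpoint's state dict and re-save the weights through a Hugging Face model class. A rough sketch, assuming the usual Composer layout where weights live under `state['model']` with a `model.` key prefix; the paths, the default `BertConfig`, and the `BertForMaskedLM` target are placeholders, not the actual script:

```python
import torch
from transformers import BertConfig, BertForMaskedLM

# Hypothetical checkpoint and output paths.
ckpt = torch.load('mosaic_bert_checkpoint.pt', map_location='cpu')

# Composer checkpoints typically nest the weights under state['model'],
# with each key prefixed by the wrapping module name (often 'model.').
state_dict = {k.removeprefix('model.'): v
              for k, v in ckpt['state']['model'].items()}

# The config must mirror the trained architecture; mosaic_bert uses
# custom layers, so strict=False tolerates non-matching keys here.
hf_model = BertForMaskedLM(BertConfig())
hf_model.load_state_dict(state_dict, strict=False)
hf_model.save_pretrained('mosaic-bert-hf')
```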

OpenAI run: `api-eval-Ik2iMA`

```
| Category | Benchmark | Subtask | Accuracy | Number few shot | Model                            |
|:---------|:----------|:--------|---------:|:----------------|:---------------------------------|
|          | gsm8k     |         | 0.482942 | 0-shot          | openai/gpt-3.5-turbo-instruct... |
```