llm-foundry
LLM training code for Databricks foundation models
I'm trying to fine-tune DBRX on a single machine with 8 H100 GPUs. I keep getting OOM errors with different configurations, and I wonder whether this is even doable. I see...
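For reference, the usual levers for this kind of OOM in llm-foundry are FSDP full sharding, activation checkpointing, and (as a last resort) CPU offload. A minimal sketch of the relevant `fsdp_config` block, written as a Python dict; the key names follow common llm-foundry YAML examples, and the values are assumptions rather than a verified DBRX recipe:

```python
# Sketch of memory-saving FSDP settings, mirroring the fsdp_config block
# used in llm-foundry training YAMLs. Values are illustrative assumptions,
# not a verified recipe for DBRX on 8x H100.
fsdp_config = {
    "sharding_strategy": "FULL_SHARD",   # shard params, grads, and optimizer state
    "mixed_precision": "PURE",           # bf16 throughout to shrink activation memory
    "activation_checkpointing": True,    # recompute activations during backward
    "activation_cpu_offload": False,     # last resort: saves HBM at a large speed cost
    "limit_all_gathers": True,           # throttle all-gathers to bound peak memory
}
```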
**[WIP] Fix Batching** Adds a wrapper, similar to the OpenAI wrappers in [this PR](https://github.com/mosaicml/llm-foundry/pull/494), for TRT models. The purpose is to enable evaluating TRT models with our gauntlet,...
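The wrapper idea translates roughly to a class that adapts a TRT engine's inference call to the batched `generate` interface the eval harness expects. A minimal sketch; the engine API (`engine.infer`) and the method names are assumptions, not llm-foundry's actual code:

```python
class TRTEvalWrapper:
    """Hypothetical adapter exposing batched text generation over a TRT engine,
    in the same spirit as the OpenAI wrappers referenced above."""

    def __init__(self, engine, tokenizer, max_new_tokens: int = 128):
        self.engine = engine
        self.tokenizer = tokenizer
        self.max_new_tokens = max_new_tokens

    def generate(self, prompts: list[str]) -> list[str]:
        # Tokenize the whole batch at once so the engine sees fixed-shape input.
        batch = self.tokenizer(prompts, return_tensors="np", padding=True)
        # Run the (assumed) engine call, then decode back to text for scoring.
        output_ids = self.engine.infer(
            batch["input_ids"], max_new_tokens=self.max_new_tokens
        )
        return self.tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```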
## Context

We have an MPT MoD prefix-lm model trained with llm-foundry and then exported to HuggingFace (via your scripts). For some fine-tuning experiments with the HF model, I tried to...
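For context, loading an MPT checkpoint exported to the HuggingFace format goes through the `Auto` classes with `trust_remote_code=True`, since MPT ships custom modeling code alongside the weights. A minimal sketch; the checkpoint path is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path to the checkpoint produced by the llm-foundry export scripts.
ckpt = "path/to/exported-mpt"

# MPT uses custom modeling code shipped with the checkpoint, so
# trust_remote_code=True is required when loading via the Auto classes.
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(ckpt, trust_remote_code=True)
```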
This PR is stacked on top of the migration PR https://github.com/mosaicml/llm-foundry/pull/936. It does 5 things:
1. Refactor CodeEval and QA tasks to have a shared superclass called InContextLearningGenerationTaskDataset (see the sketch after this list).
2. Rename...
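The refactor in item 1 amounts to hoisting the generation plumbing that CodeEval and QA previously duplicated into one base class. A minimal sketch of what such a hierarchy could look like; the subclass names and methods here are hypothetical, not the PR's actual code:

```python
class InContextLearningGenerationTaskDataset:
    """Hypothetical shape of the shared superclass named in item 1."""

    def construct_context(self, example: dict) -> str:
        # Subclasses decide how an example becomes a prompt.
        raise NotImplementedError

    def build_batch(self, examples: list[dict]) -> list[str]:
        # Shared batching path: one implementation serves both task types.
        return [self.construct_context(ex) for ex in examples]


class QATaskDataset(InContextLearningGenerationTaskDataset):
    def construct_context(self, example: dict) -> str:
        return f"Question: {example['question']}\nAnswer:"


class CodeEvalDataset(InContextLearningGenerationTaskDataset):
    def construct_context(self, example: dict) -> str:
        return example["prompt"]  # code stubs are passed through verbatim
```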
JIRA: https://databricks.atlassian.net/jira/software/c/projects/STR/issues/STR-141?filter=allissues This script is useful in scenarios where the FT API data input has been malformed. It acts as a preventive measure to ensure data integrity and helps in...
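A sketch of the kind of check such a script performs on FT API input, assuming JSONL rows with prompt/response fields; the field names and error reporting here are illustrative assumptions:

```python
import json

def validate_ft_jsonl(path: str, required_keys=("prompt", "response")) -> list[str]:
    """Return human-readable problems found in a fine-tuning JSONL file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            try:
                row = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append(f"line {lineno}: invalid JSON ({e})")
                continue
            for key in required_keys:
                # Each required field must exist, be a string, and be non-empty.
                if not isinstance(row.get(key), str) or not row[key].strip():
                    problems.append(f"line {lineno}: missing or empty {key!r}")
    return problems
```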
* add notebook/data_validation_notebook, which runs data preparation and token counting from the byod/data_validation branch. Merged to main to keep the underlying functions up-to-date (see the token-counting sketch after this list).
* add utils functions used by notebook/data_validation_notebook
* shuffle...
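The token-counting half of the notebook reduces to tokenizing each example and summing lengths. A minimal sketch with a HuggingFace tokenizer; the tokenizer name and field names are assumptions:

```python
from transformers import AutoTokenizer

# Tokenizer choice is an assumption; substitute whatever the run will train with.
tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b")

def count_tokens(examples: list[dict]) -> int:
    """Total tokens across prompt+response pairs, as the notebook would tally them."""
    total = 0
    for ex in examples:
        total += len(tokenizer(ex["prompt"] + ex["response"])["input_ids"])
    return total
```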
Enable SpeedMonitor on HF models by using PyTorch FlopCounterMode to calculate model FLOPs.
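FlopCounterMode traces a forward/backward pass and tallies per-operator FLOPs, which gives SpeedMonitor a per-batch flop count without a hand-written formula. A minimal standalone sketch; the toy model stands in for an HF model, since FlopCounterMode works on any nn.Module:

```python
import torch
from torch.utils.flop_counter import FlopCounterMode

# Toy stand-in for an HF model.
model = torch.nn.Linear(1024, 1024)
x = torch.randn(8, 1024)

flop_counter = FlopCounterMode(display=False)
with flop_counter:
    # Count forward and backward together, as a training step would.
    model(x).sum().backward()

flops_per_batch = flop_counter.get_total_flops()
print(f"{flops_per_batch=:,}")
```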