llm-foundry

LLM training code for Databricks foundation models

Results: 267 llm-foundry issues

Hi, do you support Flash Attention 3, or only FA2?

It is not clear whether a MoE model from HF, say Mixtral 8x7B, can be converted into dmoe layers to exploit expert parallelism. Is it possible to start...

question

When a run is killed and automatically restarts after training is complete, it is thrown into a retry loop because Composer raises an error when no training will happen. This PR...

I encountered a bug while using `composer.utils.dist.get_node_signal_file_name`.

## Setup
- llm-foundry==release/v0.17.1

If I execute a training script on a single node I have no issue and the training...

bug

It seems that only ZeRO-3/DP (i.e. FSDP or HSDP) is supported in LLM Foundry, while other parallelization techniques like Tensor Parallelism (TP), Pipeline Parallelism (PP) and Sequence Parallelism (or Context...

enhancement

## 🚀 Feature Request
Provide a metric that uses [Math-Verify](https://github.com/huggingface/Math-Verify) to parse and compare mathematical expressions with more flexibility than `InContextLearningGenerationExactMatchAccuracy`.

## Motivation
https://huggingface.co/blog/math_verify_leaderboard reports that overly simple methods for...

enhancement

I spent a lot of time trying to get a custom metric working only to realize that it wasn't showing up in the output simply because https://github.com/LocalResearchGroup/llm-foundry/blob/a4bd9fc08f8dac482970d42a94ef2cdda2659a60/llmfoundry/command_utils/eval.py#L473-L474 looks for "Accuracy"...
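The kind of behavior described in this issue can be illustrated with a minimal sketch. It is not the code at the linked `eval.py` lines; the function `select_reported_metrics`, the metric name `MyCustomMathScore`, and the numeric values are all hypothetical placeholders. It only assumes that reporting keeps metrics whose name contains the substring "Accuracy".

```python
# Hypothetical illustration only: NOT the actual llm-foundry eval code.
# Assumption: reporting keeps metrics whose name contains a keyword.

def select_reported_metrics(
    metrics: dict[str, float], keyword: str = "Accuracy"
) -> dict[str, float]:
    """Keep only metrics whose name contains `keyword` (substring match)."""
    return {name: value for name, value in metrics.items() if keyword in name}


# Placeholder values, not real results.
computed = {
    "InContextLearningGenerationExactMatchAccuracy": 0.41,
    "MyCustomMathScore": 0.57,  # hypothetical custom metric name
}

reported = select_reported_metrics(computed)
dropped = sorted(set(computed) - set(reported))

print("reported:", reported)
print("silently dropped:", dropped)
```

Under that assumption, a custom metric would still be computed but would never appear in the output unless its name happened to contain the keyword. That would match the symptom reported above.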

## Environment
```
composer_collect_env
Collecting system information...
---------------------------------
System Environment Report
Created: 2025-02-02 12:32:57 CST
---------------------------------
PyTorch information
-------------------
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build...
```

bug