llm-foundry
LLM training code for Databricks foundation models
Hi! Using the Docker image `mosaicml/llm-foundry:2.0.1_cu118-latest`, I'm training mpt-125m with your default parameters, but my loss explodes after some number of steps. I have added 2k warmup steps as well. It...
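For reference, a minimal sketch of a linear warmup schedule in plain PyTorch (the model, peak LR, and step count here are illustrative, not llm-foundry's defaults):

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

# Illustrative placeholders, not the repo's actual training setup.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-4)

warmup_steps = 2000  # matches the 2k warmup steps mentioned above

def warmup_lambda(step: int) -> float:
    # Scale the LR linearly from 0 to its peak over the warmup window,
    # then hold it flat (a cosine decay would typically follow).
    return min(1.0, (step + 1) / warmup_steps)

scheduler = LambdaLR(optimizer, lr_lambda=warmup_lambda)

for step in range(5):
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())
```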
Run with amp_fp16:

| Benchmark | Subcategory | Accuracy | Number few shot | Model |
|:----------|:------------|---------:|----------------:|:----------------|
| jeopardy | Average | 0.279767 | 0 | mosaicml/mpt-7b |
| | ...
Hello, I'm trying to fine-tune MPT-7B starting from `mpt-7b_dolly_sft.yaml`, but I observe that the train loss, cross entropy, and perplexity all stay fixed at a constant value throughout the entire training...
This fixes a bug in `hf_chat.py` where the custom system prompt and the user/assistant format strings were ignored. It also cleans up the implementation and adds streaming generation.
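For context, here is a minimal sketch of the kind of behavior described, using Hugging Face's `TextIteratorStreamer` for streaming generation; the format strings and the small model below are stand-ins, not `hf_chat.py`'s actual defaults:

```python
from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical format strings standing in for hf_chat.py's options.
system_prompt = "You are a helpful assistant."
user_format = "<|user|>: {}"
assistant_format = "<|assistant|>:"

prompt = f"{system_prompt}\n{user_format.format('Hello!')}\n{assistant_format}"
inputs = tokenizer(prompt, return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Run generation on a background thread and print tokens as they stream in.
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, streamer=streamer, max_new_tokens=32),
)
thread.start()
for text in streamer:
    print(text, end="", flush=True)
thread.join()
```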
## ❓ Question Hi, I am trying to run zero-shot evaluation for the 30-billion-parameter `llama-30b`. Even with `batch_size = 1`, I am getting a `torch.cuda.OutOfMemoryError: CUDA out of...
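One common mitigation (generic `transformers` usage, not specific to this repo's eval harness) is to load the checkpoint in half precision and shard it across available GPUs with `device_map="auto"`; the model name below is a stand-in for the questioner's local checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "huggyllama/llama-30b" is a stand-in; substitute your local checkpoint path.
name = "huggyllama/llama-30b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.float16,  # halves weight memory vs. fp32
    device_map="auto",          # shards layers across available GPUs (requires `accelerate`)
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```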
The below code: - adds a `SharedEmbedding` class that lets us get rid of an `F.linear` call. This is necessary with certain wrapping structures (our HF ones); otherwise FSDP emits...
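A minimal sketch of the idea (mirroring the description above, not necessarily the merged code): the same embedding weight serves both as the input embedding and, through the module's own forward, as the output unembedding, so no separate `F.linear` call on the raw parameter is needed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEmbedding(nn.Embedding):
    """Embedding that can also unembed with the same (tied) weight matrix."""

    def forward(self, input: torch.Tensor, unembed: bool = False) -> torch.Tensor:
        if unembed:
            # Project hidden states back to vocab logits with the tied weight,
            # keeping the op inside the module so FSDP wrapping stays happy.
            return F.linear(input, self.weight)
        return super().forward(input)

emb = SharedEmbedding(num_embeddings=100, embedding_dim=16)
tokens = torch.randint(0, 100, (2, 8))
hidden = emb(tokens)                # (2, 8, 16) token embeddings
logits = emb(hidden, unembed=True)  # (2, 8, 100) output logits
```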
Minor bug: we can't pass devices like `torch.device('cuda:0')` into the autocast function. Instead you need to pass `torch.device('cuda:0').type`, which is `'cuda'`. Tested on an interactive instance.
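For illustration, the failing and working calls side by side:

```python
import torch

device = torch.device('cuda:0')

# torch.autocast(device)       # fails: autocast expects a device *type* string
with torch.autocast(device.type):  # device.type == 'cuda', which works
    pass
```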
Add Replit repo to our main README