llm-foundry
LLM training code for Databricks foundation models
Hi team, I'm fine-tuning with 6 V100 GPUs, and the fine-tuning process is extremely slow for me. I'm using fp16 and attn_impl: torch, with a global_train_batch_size of 12 and device_train_microbatch_size set automatically...
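For reference, a minimal sketch of how these settings usually appear in an llm-foundry train YAML (this is an assumption about the reporter's config, not a quote of it; the exact nesting of `attn_impl` has varied between releases, and everything else is omitted):

```yaml
# Illustrative excerpt only, not the reporter's full config.
precision: amp_fp16                   # fp16 mixed precision
global_train_batch_size: 12           # split across the 6 V100s
device_train_microbatch_size: auto    # trainer picks the per-GPU microbatch size
model:
  attn_config:
    attn_impl: torch                  # unfused reference attention; typically the slowest option
```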
I forked Triton and renamed it to `triton_pre_mlir`; the triton diff is [here](https://github.com/openai/triton/compare/main...vchiley:triton:triton_pre_mlir). llmfoundry/models/layers/flash_attn_triton.py is copy-pasted from [HazyResearch flash_attn_triton](https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attn_triton.py), where I modify the imports to be ``` import triton_pre_mlir as triton import...
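For context, a sketch of what that import change presumably looks like in the copied kernel file; this is inferred from the truncated snippet above, and the linked file is the authoritative version:

```python
# Assumed shape of the modified imports: the forked Triton is installed under a
# different top-level package name, so the copied HazyResearch kernel aliases it
# back to the names the original code expects.
import triton_pre_mlir as triton
import triton_pre_mlir.language as tl
```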
Docker Image with CUDA 12.1 for Ada-generation cards
## ❓ Question
I want to fine-tune the model with SageMaker. Is there a guide on how to do it? I have a dataset that I want to fine-tune the model...
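There is no SageMaker-specific guide in this excerpt, but a minimal sketch of launching the llm-foundry training entry point as a SageMaker training job might look like the following. The entry-point path, role ARN, instance type, framework versions, hyperparameters, and S3 channel are all illustrative assumptions, not a documented recipe, and how train.py consumes its YAML argument would need adapting since SageMaker passes hyperparameters as CLI flags:

```python
# Hypothetical sketch: run scripts/train/train.py as a SageMaker PyTorch training job.
# All paths, versions, names, and instance types below are placeholders.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                  # llm-foundry training script
    source_dir="llm-foundry/scripts/train",  # assumed layout of the cloned repo
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role
    instance_count=1,
    instance_type="ml.p4d.24xlarge",         # any GPU instance; illustrative choice
    framework_version="2.0",
    py_version="py310",
    hyperparameters={"yaml_path": "yamls/finetune/mpt-7b_dolly_sft.yaml"},  # assumed config
)

# The channel name and S3 prefix are placeholders; the dataset would need to be
# referenced from the YAML or passed through hyperparameters.
estimator.fit({"train": "s3://my-bucket/finetune-data/"})
```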
I'm observing my optimizer metrics while MPT trains, and some blocks are inf, e.g. `Train cosine/update_grad/model._fsdp_wrapped_module.transformer.blocks.9._fsdp_wrapped_module.ffn.down_proj.weight: inf`. Is this OK, or why does this happen? Are you aware of this issue? I guess...
I am trying to convert the RedPajama-github dataset to streaming format but am getting the error below. To replicate:

```
python llm-foundry/scripts/data_prep/convert_dataset_json.py \
  --path github/split1 \
  --out_root github/split1 \
  --split train \
  --concat_tokens...
```
## ❓ Question

## Additional context
I'm confused by the demo at https://huggingface.co/spaces/mosaicml/mpt-7b-instruct and the GitHub script inference/hf_chat.py; the latter seems stupid .... for example, please write a java function to query...
Hello! I hope you are doing well! I've requested access twice but didn't get any answer or feedback. Please provide an email address I can use to re-send my request...
Adds an 8-bit version of the LION optimizer. Some non-obvious aspects of this include:
- CUDA kernels for int8 quantization and dequantization of floats. Kernels use numba since I got stonewalled...
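As background, a minimal sketch of what an 8-bit LION step can look like, assuming per-tensor symmetric int8 quantization of the momentum state; this illustrates the idea only and is not the PR's actual numba/CUDA kernels:

```python
import torch

def lion8bit_step(p, grad, m_q, m_scale, lr=1e-4, wd=0.0, beta1=0.9, beta2=0.99):
    """One illustrative 8-bit LION step.

    p: parameter tensor, grad: its gradient,
    m_q: int8-quantized momentum, m_scale: per-tensor scale (both assumed state layout).
    Returns the updated (m_q, m_scale).
    """
    # Dequantize the momentum back to the parameter dtype.
    m = m_q.to(p.dtype) * m_scale

    # LION: the update direction is the sign of an interpolation of momentum and gradient,
    # applied with decoupled weight decay.
    update = (beta1 * m + (1 - beta1) * grad).sign_()
    p.add_(update + wd * p, alpha=-lr)

    # Momentum update, then requantize to int8 with a fresh per-tensor scale.
    m = beta2 * m + (1 - beta2) * grad
    m_scale = m.abs().max().clamp(min=1e-12) / 127.0
    m_q = torch.clamp(torch.round(m / m_scale), -127, 127).to(torch.int8)
    return m_q, m_scale
```

Real 8-bit optimizer implementations typically quantize the state blockwise rather than per-tensor to limit error; the kernels in this PR may well differ in those details.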
Support remote jsonl files for finetuning.