llm-foundry issues

GPU memory issue

2

I used the train script from the README to kick off 125M model training for 10 batches, exactly as in the example, and was surprised to see this used almost all...

zhranj

Evaluation result mismatch

I tried to run the 0-shot evaluation on Winograd for the MPT-7B model. This is the result I got: `Eval metrics/winograd/0-shot/InContextLearningMultipleChoiceAccuracy: 0.5055`. This is the script I used:...

congyingxia

How to run on a V100 GPU

5

When I run `composer train.py yamls/mpt/125m.yaml train_loader.dataset.split=train_small eval_loader.dataset.split=val_small`, I get the following error. My GPU is a V100. Traceback (most recent call last): File "", line 21, in _bwd_kernel KeyError: ('2-.-0-.-0-1e8410f206c822547fb50e2ea86e45a6-2b0c5161c53c71b37ae20a9996ee4bb8-c1f92808b4e4644c1732e8338187ac87-42648570729a4835b21c1c18cebedbfe-12f7ac1ca211e037f62a7c0c323d9990-5c5e32ff210f3b7f56c98ca29917c25e-06f0df2d61979d629033f4a22eff5198-0dd03b0bd512a184b3512b278d9dfa59-d35ab04ae841e2714a253c523530b071',...

sysusicily

Windows support?

6

Hello, I have been unable to run the model on Windows: the install fails because it requires Triton, which is only supported on Linux. Any ideas? Thanks in...

deepbeepmeep

Generate shorter sentences

1

I am using mpt-instruct through Hugging Face to create titles, and I need to generate shorter sentences. How can I restrict the length of the generated sentences? What options are available?...

NarenZen

What are the hardware requirements?

See title lol

soloist-tech

Error in FSDP with composer

2

When finetuning an MPT-7B model with 8 GPUs, I get the following error when training is about to begin (after model and dataset loading, etc.): ``` Traceback (most recent call last):...

bjoernpl

How to use train.py to finetune the pre-trained MPT-7B?

1

It seems like we need to run the pre-training process to get a checkpoint file before we can fine-tune the MPT model. 1b_local_data_sft.yaml mentions that we have to replace the...

metacarbon

Broken on docker image?

6

I am trying to follow the Quickstart guide on the mosaicml/pytorch Docker image and am running into issues when running the exact commands. The training step is broken. In particular, there...

tginart

How to install torch 1.13.1+cu117?

3

`pip3 install torch` does not install torch with support for CUDA 11.7. Therefore I'm not able to install all requirements in a new venv. `pip list` lists that torch version...

ighodgao
