llm-foundry
LLM training code for Databricks foundation models
## 🚀 Feature Request
I have been investigating how we can make GPTQ work in order to quantize MPT models. It seems that a lot of progress has been made...
Add ActivationMonitor callback from composer
I'm interested in using `llm-foundry` infrastructure for training LLMs for sequence classification/regression tasks. I currently have a fork of `llm-foundry` where I got this working (in a fairly hacky manner...
I followed the tutorial at `train/finetune_example/mpt-7b-arc-easy--gpu.yaml` and added an additional evaluation using `icl_tasks: 'eval/yamls/tasks_light.yaml'` in order to evaluate accuracy on ARC Easy. As the model finetuned, training loss decreased, but...
## Environment

```bash
Collecting system information...
---------------------------------
System Environment Report
Created: 2023-08-21 17:44:51 CST
---------------------------------

PyTorch information
-------------------
PyTorch version: 2.0.1+cu117
Is debug build: False
CUDA used to build PyTorch: ...
```
https://github.com/mosaicml/llm-foundry/blob/bd8127252c660e45ed01413645d29427f86c085a/scripts/data_prep/convert_dataset_json.py#L204C4-L204C4

`out=os.path.join(args.out_root),` should be `out=os.path.join(args.out_root, folder_split),` as in `convert_dataset_hf.py`.
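Below is a minimal sketch of the proposed fix in context. The surrounding loop, the `dataset_splits` mapping, and the placeholder values are assumptions for illustration; only the change to the `out=` argument comes from this report.

```python
import os
from argparse import Namespace
from streaming import MDSWriter

# Illustrative placeholders; in the real script these come from the CLI args and the tokenizer.
args = Namespace(out_root='/tmp/json_dataset_mds', compression='zstd')
columns = {'tokens': 'bytes'}
dataset_splits = {'train': [{'tokens': b'\x00\x01'}], 'val': [{'tokens': b'\x02\x03'}]}

for folder_split, split_dataset in dataset_splits.items():
    with MDSWriter(
        columns=columns,
        # was: out=os.path.join(args.out_root)          -> every split writes to the same directory
        out=os.path.join(args.out_root, folder_split),  # fix: one subdirectory per split
        compression=args.compression,
    ) as out:
        for sample in split_dataset:
            out.write(sample)
```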
## 🚀 Feature Request
Unify the export script arguments after the next version of LLM-Foundry to improve UX, but not immediately, so we don't break people's workflows.

## Motivation
Inference scripts have...
In the [\_\_iter\_\_](https://github.com/mosaicml/llm-foundry/blob/main/llmfoundry/data/data.py#L116) method of the `ConcatTokensDataset` class, the `dtype` argument is not specified in the statement `yield {'tokens': np.asarray(concat_sample).tobytes()}`. The integer dtype NumPy picks by default is platform-dependent (e.g. `np.int32` on some platforms). On the...
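To make the failure mode concrete, here is a small self-contained sketch; the function name `iter_token_samples` and the example token ids are invented for illustration, and only the `np.asarray(...).tobytes()` pattern comes from `ConcatTokensDataset`.

```python
import numpy as np

def iter_token_samples(samples, dtype=np.int64):
    """Sketch of the fix: pin an explicit dtype before serializing token ids to bytes."""
    for concat_sample in samples:
        # Without dtype=..., np.asarray picks a platform-dependent integer width,
        # so the byte layout written here may not match what a reader assumes.
        yield {'tokens': np.asarray(concat_sample, dtype=dtype).tobytes()}

# Usage: decode with the same dtype that was used to encode.
sample = next(iter_token_samples([[15496, 995, 50256]]))
tokens = np.frombuffer(sample['tokens'], dtype=np.int64)
```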
## Console

```
[Eval batch=1/1289] Eval on lambada_openai/0-shot data
[Eval batch=130/1289] Eval on lambada_openai/0-shot data
[Eval batch=259/1289] Eval on lambada_openai/0-shot data
[Eval batch=387/1289] Eval on lambada_openai/0-shot data
[Eval batch=516/1289] Eval on...
```
Finetuning mpt-7b and mpt-30b with QLoRA gives the error `ValueError: MPTForCausalLM does not support gradient checkpointing.` Is there a way to fix this?
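A possible workaround sketch, not a confirmed fix: assuming the error is raised by `gradient_checkpointing_enable()` (which PEFT's k-bit preparation calls by default) on a model class that does not declare gradient-checkpointing support, skipping that step avoids the crash at the cost of higher activation memory.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map='auto',
)

# use_gradient_checkpointing=False skips model.gradient_checkpointing_enable(),
# which is the call that raises "MPTForCausalLM does not support gradient checkpointing."
# This only avoids the error; it does not add gradient-checkpointing support to MPT.
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=False)
```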