llm-foundry
LLM training code for Databricks foundation models
Hi, I want to add a custom ICL task type that corresponds to a new ICL metric in my codebase. We currently maintain our own patch of the training...
This PR adds the TransformerEngine fp8 attention implementation (https://github.com/NVIDIA/TransformerEngine/blob/main/transformer_engine/pytorch/attention.py). To enable it, set the fields below in the yaml config.

```
precision: amp_fp8
model:
  attn_config:
    attn_type: te_multihead_attention
    kv_n_heads: 8
  fc_type:...
```
Small QoL improvement for testing generation. Uses Composer's `get_file` to support remote prompt files, and adds a bit of syntax for pointing at Hugging Face Hub datasets as well. `-p file::/local/path`...
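For illustration only, the "scheme::value" prompt syntax mentioned above could be resolved along these lines. The function name `resolve_prompts` and the exact behavior are hypothetical sketches, not foundry's actual code; a remote path would be fetched through Composer's `get_file` first, which is omitted here to keep the example local and runnable.

```python
import tempfile

def resolve_prompts(prompt_arg: str) -> list[str]:
    """Hypothetical sketch: a 'file::' prefix reads one prompt per line
    from a local file; anything else is treated as a literal prompt."""
    if prompt_arg.startswith("file::"):
        path = prompt_arg[len("file::"):]
        with open(path) as f:
            return [line.rstrip("\n") for line in f if line.strip()]
    return [prompt_arg]

# Usage: write two prompts to a temp file and resolve them.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("Hello\nWorld\n")
prompts = resolve_prompts(f"file::{f.name}")
print(prompts)  # ['Hello', 'World']
```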
Brier score seems of questionable usefulness. COPA results: the first number for each model is the Brier score. Below we find that accuracy and Brier score both go up with model size...
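For reference, the multiclass Brier score under discussion can be computed as below. This is a generic NumPy sketch of the standard definition (mean squared distance between the predicted probability vector and the one-hot label), not foundry's metric implementation.

```python
import numpy as np

def brier_score(probs, labels):
    """Multiclass Brier score: mean squared distance between predicted
    probabilities and one-hot true labels. Lower is better; 0 means a
    fully confident, correct prediction."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.eye(probs.shape[1])[np.asarray(labels)]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

# A confident correct prediction scores near 0; uniform guessing over
# two choices (as in COPA) scores 0.5.
print(round(brier_score([[0.9, 0.1]], [0]), 4))  # 0.02
print(brier_score([[0.5, 0.5]], [0]))            # 0.5
```

Note that, unlike accuracy, the score rewards calibration: a model can become more accurate while its Brier score worsens if its probabilities grow overconfident, which is why the two can disagree across model sizes.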
## Environment
- `llmfoundry:latest`

## To reproduce
Steps to reproduce the behavior:
1. Train a prefix-lm
2. Convert it to Hugging Face via `llm-foundry/scripts/inference/convert_composer_to_hf.py`
3. Try to generate texts with the...
This PR implements tensor parallelism using PyTorch's new DTensor library. This can either be used standalone or in a 2D-parallel fashion alongside other parallelism strategies such as FSDP. It partitions...
When loading a prefix-lm model trained with llm-foundry into Hugging Face, one is tempted to call `AutoModelForCausalLM.from_pretrained()`. However, this loads the model not as a prefix-lm but as a causal...
## Environment

```
Collecting system information...
---------------------------------
System Environment Report
Created: 2023-11-21 21:17:06 UTC
---------------------------------

PyTorch information
-------------------
PyTorch version: 2.1.0+cu121
...
```
## Environment
- llm-foundry: latest

## To reproduce
Steps to reproduce the behavior:
1. Train a `hf_t5` model
2. Download the composer checkpoint
3. Try to convert it back to...
## ❓ Question
I am trying to benchmark `llama-2-7b` on GLUE for in-context learning, but the accuracy I get for MNLI (`mismatched validation`) is 35.22 for both zero-shot...