llm-foundry
LLM training code for Databricks foundation models
Hi, I want to add a custom ICL task type that corresponds to a new ICL metric in my codebase. We currently maintain our own patch of the training...
This PR adds the TransformerEngine fp8 attention implementation (https://github.com/NVIDIA/TransformerEngine/blob/main/transformer_engine/pytorch/attention.py). To enable it, set the fields below in the yaml config.

```
precision: amp_fp8
model:
  attn_config:
    attn_type: te_multihead_attention
    kv_n_heads: 8
  fc_type:...
```
Small QoL improvement for testing generation. Uses Composer's `get_file` to support remote prompt files, and adds a bit of syntax for pointing at Hugging Face Hub datasets as well. `-p file::/local/path`...
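For illustration only, the "scheme::value" prompt syntax mentioned above could be resolved along these lines. The function name `resolve_prompts` and the exact behavior are hypothetical sketches, not foundry's actual code; a remote path would be fetched through Composer's `get_file` first, which is omitted here to keep the example local and runnable.

```python
import tempfile

def resolve_prompts(prompt_arg: str) -> list[str]:
    """Hypothetical sketch: a 'file::' prefix reads one prompt per line
    from a local file; anything else is treated as a literal prompt."""
    if prompt_arg.startswith("file::"):
        path = prompt_arg[len("file::"):]
        with open(path) as f:
            return [line.rstrip("\n") for line in f if line.strip()]
    return [prompt_arg]

# Usage: write two prompts to a temp file and resolve them.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("Hello\nWorld\n")
prompts = resolve_prompts(f"file::{f.name}")
print(prompts)  # ['Hello', 'World']
```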
Brier score seems of questionable usefulness. COPA results: the first number for each model is the Brier score. Below we find that accuracy and Brier score both go up with model size...
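For reference, the multiclass Brier score under discussion can be computed as below. This is a generic NumPy sketch of the standard definition (mean squared distance between the predicted probability vector and the one-hot label), not foundry's metric implementation.

```python
import numpy as np

def brier_score(probs, labels):
    """Multiclass Brier score: mean squared distance between predicted
    probabilities and one-hot true labels. Lower is better; 0 means a
    fully confident, correct prediction."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.eye(probs.shape[1])[np.asarray(labels)]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

# A confident correct prediction scores near 0; uniform guessing over
# two choices (as in COPA) scores 0.5.
print(round(brier_score([[0.9, 0.1]], [0]), 4))  # 0.02
print(brier_score([[0.5, 0.5]], [0]))            # 0.5
```

Note that, unlike accuracy, the score rewards calibration: a model can become more accurate while its Brier score worsens if its probabilities grow overconfident, which is why the two can disagree across model sizes.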
## Environment
- `llmfoundry:latest`

## To reproduce
Steps to reproduce the behavior:
1. Train a prefix-lm
2. Convert it to Hugging Face via `llm-foundry/scripts/inference/convert_composer_to_hf.py`
3. Try to generate texts with the...
This PR implements tensor parallelism using PyTorch's new DTensor library. This can either be used standalone or in a 2D-parallel fashion alongside other parallelism strategies such as FSDP. It partitions...
When loading a prefix-lm model trained with llm-foundry into Hugging Face, one is tempted to call `AutoModelForCausalLM.from_pretrained()`. However, this loads the model not as a prefix-lm but as a causal...
## Environment

```
Collecting system information...
---------------------------------
System Environment Report
Created: 2023-11-21 21:17:06 UTC
---------------------------------

PyTorch information
-------------------
PyTorch version: 2.1.0+cu121
...
```
## Environment
- llm-foundry: latest

## To reproduce
Steps to reproduce the behavior:
1. Train a `hf_t5` model
2. Download the composer checkpoint
3. Try to convert it back to...
## ❓ Question
I am trying to benchmark `llama-2-7b` on GLUE for in-context learning, but the accuracy I get for MNLI (`mismatched validation`) is 35.22 for both zero-shot...