llm-foundry
LLM training code for Databricks foundation models
Ran `composer train/train.py train/yamls/pretrain/mpt-3b.yaml`, also with `model.fc_type=te` and with `precision=amp_fp8`. Result:
```
torch:       throughput/device/tokens_per_sec: 23.7k
te:          throughput/device/tokens_per_sec: 23.7k
te with fp8: throughput/device/tokens_per_sec: 29.4k
```
Note: there does seem to be this...
This PR enables `--device_map auto`, which allows these scripts to be used with very large models that don't fit on a single GPU. It also removes the need for FSDP support @alextrott16...
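For context, here is a minimal sketch of what `device_map auto` corresponds to when loading a checkpoint with `transformers` (the model name is just an example, and `accelerate` must be installed):

```
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b"
tokenizer = AutoTokenizer.from_pretrained(name)
# device_map="auto" lets accelerate shard the layers across the
# available GPUs (spilling to CPU if needed) instead of requiring
# the whole model to fit on a single device.
model = AutoModelForCausalLM.from_pretrained(
    name,
    device_map="auto",
    trust_remote_code=True,
)
```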
When I run with an eval set, I only get `metrics/eval`. I am wondering if there is a way to configure llm-foundry via yaml to also compute `loss/eval` in the...
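For reference, `loss/eval` here means the plain cross-entropy loss on the eval set; a minimal, framework-independent sketch of computing it with a Hugging Face model (the model name is just an example):

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

batch = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy loss.
    out = model(**batch, labels=batch["input_ids"])
print(f"eval loss: {out.loss.item():.4f}")
```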
Background: PyTorch's DataLoader hangs on several machines (locally, on a VM, and on Colab) because the `num_workers` argument is set too high. Generally, when using multiple processes, we want to scale with the number of...
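As one illustration of the proposed scaling, the worker count can be capped at the host's CPU count before constructing the loader (the dataset and the numbers below are placeholders, not code from this issue):

```
import os
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100).float())
# Cap the requested worker count at the number of available CPUs so the
# DataLoader does not oversubscribe worker processes and hang.
num_workers = min(8, os.cpu_count() or 1)
loader = DataLoader(dataset, batch_size=10, num_workers=num_workers)
```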
I am using a [g5.12xlarge](https://instances.vantage.sh/aws/ec2/g5.12xlarge) instance on AWS with 96 GB of GPU memory. I am attempting to finetune a model on a custom dataset. To accomplish this, I created a...
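If the custom dataset is local, one sketch of preparing it in the prompt/response jsonl layout that llm-foundry's finetuning dataloader consumes (the file name and examples are placeholders):

```
import json

# Placeholder examples; the finetuning path expects prompt/response
# pairs, one JSON object per line.
examples = [
    {"prompt": "What is 2 + 2?", "response": "4"},
    {"prompt": "Name a primary color.", "response": "Red"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```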
This is the first in a series of PRs that bring this library into compliance with `pyright`. These fixes should introduce no functional changes to the code. Before: ```...
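For readers unfamiliar with `pyright`, a typical fix of this kind looks like the Optional-narrowing below; this exact function is illustrative, not code from the PR:

```
from typing import Optional

def get_length(name: Optional[str]) -> int:
    # Explicit narrowing: after this check pyright knows `name` is a str,
    # so len(name) type-checks without any functional change.
    if name is None:
        return 0
    return len(name)
```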
I'm trying to use `hf_generate.py`; why is it not working with the flag `--attn_impl triton`? I also changed `config.attn_config['attn_impl'] = 'triton'` (from `'torch'`) in `convert_composer_to_hf.py`.
```
ValueError: Requirements for `attn_impl: triton` not installed....
```
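That error usually means the GPU extras, which pull in the triton attention requirements, aren't installed (e.g. via `pip install -e ".[gpu]"` in the llm-foundry repo). Once they are, the MPT model card's pattern for selecting the triton implementation looks like this (the model name is an example):

```
from transformers import AutoConfig, AutoModelForCausalLM

name = "mosaicml/mpt-7b"
config = AutoConfig.from_pretrained(name, trust_remote_code=True)
# Switch the attention implementation from the default 'torch' to 'triton'.
config.attn_config["attn_impl"] = "triton"
model = AutoModelForCausalLM.from_pretrained(
    name, config=config, trust_remote_code=True
)
```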
## ❓ **Question**

I am trying to use the model through a Hugging Face pipeline, but the model didn't load. My code is:
```
llm = HuggingFacePipeline.from_model_id(model_id='mosaicml/mpt-7b-instruct', task="text-generation", trust_remote_code=True)
```
and it fails with `ValueError: Loading mosaicml/mpt-7b-instruct requires you...`
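One possible workaround, assuming the wrapper isn't forwarding `trust_remote_code` to `from_pretrained`: build the `transformers` pipeline yourself with the flag set, then hand it to `HuggingFacePipeline`:

```
from transformers import pipeline
from langchain.llms import HuggingFacePipeline

# Build the pipeline directly so trust_remote_code reaches from_pretrained.
pipe = pipeline(
    "text-generation",
    model="mosaicml/mpt-7b-instruct",
    trust_remote_code=True,
)
llm = HuggingFacePipeline(pipeline=pipe)
```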
This PR includes a handful of onboarding/tutorial resources and improvements, the majority of the change being a new `TUTORIAL.md` file that is meant to provide a more in-depth...