Vitaliy Chiley

Results 64 comments of Vitaliy Chiley

Yes, the current setup depends on torch==1.13.1. [torch2 has some issues](https://github.com/mosaicml/llm-foundry/issues/57#issuecomment-1537225586), but we will upgrade when these are resolved. (Note: people have run our repo with torch2, but it does...

MPT is a GPT-style network. You'd want to create a conversion script, similar to [this one](https://github.com/NVIDIA/FasterTransformer/blob/c6e8f60ec40da218804a60e6aa986903e7fa8594/examples/pytorch/gpt/utils/huggingface_gpt_convert.py), for converting the MPT HF model into the FT format. When we write...
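The core of such a conversion script is mechanical: load the HF checkpoint, rename each weight tensor to FT's naming scheme, then write the tensors out (splitting attention/MLP weights per tensor-parallel rank). A minimal sketch of the key-renaming step — the MPT and FT parameter names below are illustrative assumptions, not the exact names either library uses:

```python
# Hypothetical sketch: map MPT-style HF state-dict keys to FT-style names.
# The concrete key names here are assumptions for illustration only.
MPT_TO_FT = {
    "wte.weight": "model.wte",
    "norm_f.weight": "model.final_layernorm.weight",
}

PER_LAYER = {
    "norm_1.weight": "input_layernorm.weight",
    "attn.Wqkv.weight": "attention.query_key_value.weight",
    "attn.out_proj.weight": "attention.dense.weight",
    "ffn.up_proj.weight": "mlp.dense_h_to_4h.weight",
    "ffn.down_proj.weight": "mlp.dense_4h_to_h.weight",
    "norm_2.weight": "post_attention_layernorm.weight",
}

def convert_key(hf_key: str) -> str:
    """Translate one HF state-dict key to its FT-style counterpart."""
    if hf_key.startswith("transformer."):
        hf_key = hf_key[len("transformer."):]
    if hf_key in MPT_TO_FT:
        return MPT_TO_FT[hf_key]
    # Per-layer keys look like "blocks.<i>.<param>".
    if hf_key.startswith("blocks."):
        _, layer_idx, param = hf_key.split(".", 2)
        return f"model.layers.{layer_idx}.{PER_LAYER[param]}"
    raise KeyError(f"unmapped key: {hf_key}")
```

The real FT converters also transpose/split each tensor for the target tensor-parallel size; the renaming table above is just the scaffolding you'd fill in from FT's expected layout.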

[The `*.c_*.*` naming](https://github.com/NVIDIA/FasterTransformer/blob/c6e8f60ec40da218804a60e6aa986903e7fa8594/examples/pytorch/gpt/utils/huggingface_gpt_convert.py#L135) makes me think they use 1x1 conv layers instead of linear layers (functionally the same thing; for some reason early transformer implementations used to do this, e.g....
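A kernel-size-1 convolution over a sequence applies the same weight matrix independently at every position, which is exactly what a linear layer does when applied token-wise. A dependency-free sketch of that equivalence:

```python
# A 1x1 "convolution" over a sequence is just a per-position matrix multiply,
# i.e. the same computation as a linear layer applied at each token.

def linear(x, W, b):
    """y = W @ x + b for one position (x: list of d_in floats)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def conv1x1(seq, W, b):
    """Width-1 conv: each output position depends only on the input
    at that same position, so the "kernel" is just a matrix."""
    return [linear(x_t, W, b) for x_t in seq]

def linear_tokenwise(seq, W, b):
    """A linear layer applied independently to every token."""
    return [linear(x_t, W, b) for x_t in seq]
```

The two produce identical outputs; in practice the layers differ only in weight layout (a conv weight carries an extra singleton kernel dimension, and HF GPT-2's `Conv1D` stores the transpose of `nn.Linear`'s weight), which is why converter scripts have to transpose some tensors.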

> Transformer Engine

We've played around with [TE and H100 FP8 support](https://www.mosaicml.com/blog/coreweave-nvidia-h100-part-1). It works, and we'll include everything when we have more seat time with H100s so we can test...

> change triton_flash_attention to flash_attention in yaml is a solution of this problem~

Flash attn does not support ALiBi, so you're technically not running the correct model...
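For reference, the attention implementation is selected in the model section of the training yaml, and switching away from the triton implementation silently drops ALiBi. A sketch of the relevant fragment — the field names follow llm-foundry's yaml conventions but should be treated as illustrative:

```yaml
model:
  name: mpt_causal_lm
  # ...
  attn_config:
    attn_impl: triton   # the triton kernel supports ALiBi
    alibi: true         # flash attn cannot honor this flag
```

Swapping `attn_impl` to `flash` makes the error go away, but the resulting model no longer applies ALiBi biases, so it is not the same model you trained.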

What version of torch are you using? Are you following the [requirements](https://github.com/mosaicml/llm-foundry#prerequisites) and installation instructions (which say to use torch1.13)? (This is an issue you would have if you are using...

The triton version installed with torch2 introduced breaking changes relative to the triton version our training setup needs. Please use torch1.13 as instructed in the [requirements](https://github.com/mosaicml/llm-foundry#prerequisites). torch2 is NOT supported...

[torch2 now works](https://github.com/mosaicml/llm-foundry/pull/149) 🥳 Note: our setup does install 2 versions of triton; please follow the install instructions. Closing the issue; if you still have issues, feel free to re-open the...

What setup are you using? Is it multi-GPU? `composer SCRIPT` will launch the script across multiple GPUs; `python SCRIPT` will not, but if there are multiple GPUs in your setup...
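Concretely, the difference between the two launch commands looks like this (the script and yaml paths are illustrative):

```shell
# composer is a distributed launcher: it spawns one process per GPU
# and sets up the process group for you.
composer scripts/train/train.py yamls/pretrain/mpt-125m.yaml

# plain python runs a single process; on a multi-GPU node the
# remaining GPUs sit idle unless you wire up distributed init yourself.
python scripts/train/train.py yamls/pretrain/mpt-125m.yaml
```

So on a multi-GPU box, using `python` instead of `composer` is the usual reason only one GPU shows activity.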

see https://github.com/mosaicml/llm-foundry/issues/143#issuecomment-1553334904