
LLM training code for Databricks foundation models

Results: 267 llm-foundry issues, sorted by recently updated

Hi, I could do with a pointer on what's going wrong here. I've followed the instructions and somehow ended up with Torch 1.13.1, when I think it needs 2.x. Cheers, J....
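A quick way to confirm the environment matches what llm-foundry expects is to check the installed Torch version up front. This is a minimal sketch, not part of the repo:

```python
# Minimal sketch (not from the repo): fail fast if the installed torch is too old.
import torch
from packaging import version

if version.parse(torch.__version__) < version.parse("2.0.0"):
    raise RuntimeError(
        f"Found torch {torch.__version__}, but llm-foundry expects torch 2.x; "
        "reinstall from the pinned requirements in a clean environment."
    )
```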

Initializing model...
Traceback (most recent call last):
  File "/content/llm-foundry/llmfoundry/models/mpt/modeling_mpt.py", line 619, in __init__
    from flash_attn.losses.cross_entropy import CrossEntropyLoss as FusedCrossEntropyLoss  # type: ignore # isort: skip
  File "/usr/local/lib/python3.10/dist-packages/flash_attn/losses/cross_entropy.py", line 9, in...
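The import in the traceback is the fused cross-entropy loss from flash-attn; when that package is built against a mismatched torch/CUDA combination, the import fails at model init. A hedged sketch of the usual workaround, falling back to the standard PyTorch loss (the wrapper below is illustrative, not llm-foundry's code):

```python
# Hedged sketch: prefer flash-attn's fused loss when it imports cleanly,
# otherwise fall back to torch's standard cross-entropy.
import torch.nn as nn

try:
    from flash_attn.losses.cross_entropy import CrossEntropyLoss as FusedCrossEntropyLoss
    loss_fn = FusedCrossEntropyLoss()
except ImportError:
    loss_fn = nn.CrossEntropyLoss()
```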

Hi, the __zero-shot__ performance on BoolQ in the LLaMA paper is 76.5, while llm-foundry gives only 62.16 (zero-shot) when following `tasks.yaml`. Is the result in the blog few-shot? How about...

It would be nice to have the model supported by GGML, so that quantized versions of it or future derivatives can also run without a GPU. See https://github.com/ggerganov/llama.cpp/issues/1333#issuecomment-1536725381 I gave...

Using `scripts/train/train.py` with yaml: `yamls/mpt/finetune/7b_dolly_sft.yaml`
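For reference, `train.py` reads its YAML with OmegaConf, so the finetuning config can be inspected before launching. A minimal sketch using the path from the issue above:

```python
# Sketch: load and inspect the referenced finetuning config before running train.py.
from omegaconf import OmegaConf

cfg = OmegaConf.load("yamls/mpt/finetune/7b_dolly_sft.yaml")
print(OmegaConf.to_yaml(cfg))  # check model, tokenizer, and dataloader settings
```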

HuggingFace -> Hugging Face

Hi, I saw in the MPT model card that the models can run with FasterTransformer, but I didn't find any details about that anywhere. Can you guys share the conversion scripts or...

Hi Team, I tried the finetuning code given in the repo with 7b_dolly_sft.yaml and ran it for one epoch. Please find the details below: [epoch=1][batch=927/927]: Train time/batch: 926 Train time/sample: 59238 Train...

How about the multilingual ability of MPT?

Another noob question... Is it possible to reduce the resource burden of fine-tuning by using PEFT/LoRA techniques? If not, will it be possible in the future with MPT models?
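One commonly suggested route, independent of llm-foundry itself, is wrapping the released MPT checkpoint with the Hugging Face PEFT library. A hedged sketch; the `target_modules` entry is an assumption about MPT's fused attention projection, not something documented here:

```python
# Hedged sketch (not llm-foundry's own API): attach a LoRA adapter to an MPT
# checkpoint so that only the small adapter matrices are trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True)
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["Wqkv"],  # assumed name of MPT's fused q/k/v projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only adapter weights require gradients
```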