
Results 428 comments of Carlos Mocholí

> I am running in 16 bit. `--precision 16-mixed` or `--precision 16-true`?
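For context, `16-mixed` keeps the master weights in FP32 and autocasts operations to FP16 (AMP), while `16-true` converts the model weights themselves to FP16. A minimal Fabric sketch of the difference (the tiny `Linear` here is just a stand-in for the real model):

```python
import torch
import lightning as L

# "16-mixed": weights stay in FP32, ops autocast to FP16 (AMP).
# "16-true":  the module itself is converted to FP16 end to end.
fabric = L.Fabric(accelerator="cuda", devices=1, precision="16-mixed")
fabric.launch()

model = torch.nn.Linear(4096, 4096)  # stand-in for the actual network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
model, optimizer = fabric.setup(model, optimizer)

x = torch.randn(8, 4096, device=fabric.device)
loss = model(x).float().mean()
fabric.backward(loss)
optimizer.step()
```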

I looked into implementing this ([branch](https://github.com/Lightning-AI/lit-gpt/compare/carmocca/mpt?expand=1)). The missing pieces are:

- ALiBi
- Low-precision LayerNorm

And to reproduce training, they also do:

- Tied embedding weights with lm_head
- ...
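For reference, a standalone PyTorch sketch of the ALiBi bias piece (not code from the branch; the function names are made up here):

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Per-head slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8), for the power-of-two case.
    exponents = torch.arange(1, n_heads + 1, dtype=torch.float32)
    return 2.0 ** (-8.0 * exponents / n_heads)

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Relative distances (j - i): 0 on the diagonal, increasingly negative for
    # tokens further in the past. Scaled per head and added to the attention
    # scores before the softmax; future positions are handled by the causal mask.
    distance = torch.arange(seq_len).view(1, -1) - torch.arange(seq_len).view(-1, 1)
    return alibi_slopes(n_heads).view(n_heads, 1, 1) * distance  # (n_heads, T, T)
```

The per-head slopes follow the geometric sequence from the ALiBi paper (shown for a power-of-two number of heads), and the resulting bias replaces learned positional embeddings.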

You are using FSDP for inference, right? It won't fit in a single 80GB card. How many devices are you using?

The model won't fit into any single 80GB card unless you do quantization. So either you do that, or the model needs to be sharded using FSDP. I don't...
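For the sharded option, roughly what the Fabric setup looks like (the `Sequential` below is just a placeholder for the 40B model; the actual generate scripts differ):

```python
import torch
import lightning as L

# Shard the weights across all local GPUs with FSDP so that no single
# 80GB card has to hold the full model.
fabric = L.Fabric(accelerator="cuda", devices=8, strategy="fsdp", precision="bf16-true")
fabric.launch()

with fabric.init_module():        # materialize weights on the target device/dtype
    model = torch.nn.Sequential(  # placeholder standing in for the 40B checkpoint
        torch.nn.Linear(8192, 8192), torch.nn.GELU(), torch.nn.Linear(8192, 8192)
    )
model = fabric.setup(model)
model.eval()

with torch.no_grad():
    out = model(torch.randn(1, 8192, device=fabric.device))
```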

LoRA distributed support is tracked in #161

Regarding training falcon 40b on 8 A100 80GB GPUs, I don't have access to that hardware, but you can try the suggestions in https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/oom.md. You'll need to use [sharding](https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/oom.md#do-sharding-across-multiple-gpus) as...
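As a rough sketch, that guide's suggestions (sharding plus a small micro batch with gradient accumulation) amount to something like the following with Fabric; the model, hyperparameters, and loop are placeholders rather than the script's actual code:

```python
import torch
import lightning as L

fabric = L.Fabric(devices=8, strategy="fsdp", precision="bf16-mixed")
fabric.launch()

model = torch.nn.Linear(4096, 4096)  # placeholder for the LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
model, optimizer = fabric.setup(model, optimizer)

accumulation_steps = 4  # keep the micro batch small, accumulate to the effective batch size
for step in range(100):
    batch = torch.randn(1, 4096, device=fabric.device)  # micro batch of 1
    is_accumulating = (step + 1) % accumulation_steps != 0
    # Skip the gradient synchronization on the accumulation steps.
    with fabric.no_backward_sync(model, enabled=is_accumulating):
        loss = model(batch).float().mean() / accumulation_steps
        fabric.backward(loss)
    if not is_accumulating:
        optimizer.step()
        optimizer.zero_grad()
```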

[Here](https://github.com/Lightning-AI/lit-gpt/blob/main/finetune/adapter.py#L69) (for example)

We have a guide for dealing with OOMs here: https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/oom.md

The pretraining scripts now have a `resume` argument. The finetuning scripts don't have it yet. Contributions would be welcome.
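For anyone picking this up, the idea is just to save the training state and load it back; a minimal Fabric sketch (paths, state keys, and the model are placeholders, not the exact pretraining-script code):

```python
import torch
import lightning as L

fabric = L.Fabric(devices=1)
fabric.launch()

model = torch.nn.Linear(128, 128)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer = fabric.setup(model, optimizer)

state = {"model": model, "optimizer": optimizer, "iter_num": 0}

resume = False  # set True to continue a previous run
if resume:
    fabric.load("out/checkpoint.pth", state)  # restores everything in `state` in place

for step in range(state["iter_num"], 10):
    state["iter_num"] = step
    # ... forward/backward/optimizer step ...
    fabric.save("out/checkpoint.pth", state)  # periodically persist a resumable checkpoint
```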