Carlos Mocholí
> Thanks, that seems to work. Should I make a merge request on the documentation to clarify that?

If you think that would be helpful for others, then go for...
I'm replacing DeepSpeed with FSDP in #118. Feel free to try it out and see if it helps before the PR is merged.
There are two ways to do this: either write the opposite operations of https://github.com/Lightning-AI/lit-gpt/blob/main/scripts/convert_hf_checkpoint.py#L19-L169 for each of the HuggingFace classes, or create an HF Transformers model version of `lit_gpt.model`. The former...
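For the first option, the core idea is to invert the weight-name mapping and rename the keys in the saved state dict. A minimal sketch with hypothetical key names standing in for the real per-architecture mapping (which also has to undo any fused/split weights):

```python
import torch

# Hypothetical 1:1 renames; the real mapping in convert_hf_checkpoint.py is
# per-architecture and also splits/merges fused weights, which this skips.
hf_to_lit = {
    "transformer.word_embeddings.weight": "transformer.wte.weight",
    "transformer.ln_f.weight": "transformer.ln_f.weight",
    "lm_head.weight": "lm_head.weight",
}
lit_to_hf = {v: k for k, v in hf_to_lit.items()}

lit_state_dict = torch.load("lit_model.pth", map_location="cpu")
hf_state_dict = {lit_to_hf.get(k, k): v for k, v in lit_state_dict.items()}
torch.save(hf_state_dict, "pytorch_model.bin")
```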
Try passing `--precision bf16-mixed` or `--precision 16-mixed`. I just switched the default in #175.
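If it helps, here is roughly what those strings mean when setting up Fabric (a sketch, not the exact code in the script):

```python
from lightning.fabric import Fabric

# "bf16-mixed" keeps the weights in fp32 and runs selected ops in bfloat16;
# "16-mixed" does the same with float16 plus gradient scaling.
# "16-true"/"bf16-true" would instead cast the weights themselves to 16 bits.
fabric = Fabric(devices=1, precision="bf16-mixed")
fabric.launch()
```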
Oh yes, you're right. We multiply this number by the world size, so we don't want the number of cores: https://github.com/Lightning-AI/lit-gpt/blob/main/lit_parrot/speed_monitor.py#L223
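To illustrate the arithmetic (hypothetical numbers; the names below are not the real ones in `speed_monitor.py`):

```python
# The lookup should hold the peak FLOPS of one whole device (GPU/TPU chip),
# because the monitor already multiplies that value by the world size.
peak_flops_per_device = 312e12   # e.g. A100 bf16 peak, per GPU, not per core
world_size = 8                   # number of devices in the run
achieved_flops_per_sec = 1.0e15  # hypothetical measured training throughput

available_flops = peak_flops_per_device * world_size
mfu = achieved_flops_per_sec / available_flops  # model FLOPS utilization
print(f"MFU: {mfu:.1%}")
```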
Did you try reducing your `micro_batch_size`? We have a guide for OOMs in https://github.com/Lightning-AI/lit-gpt/blob/main/howto/oom.md. Running `adapter.py` with current main, falcon-7b, precision=16-true, and micro_batch_size=1 should use 22.69 GB of max allocated memory.
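The usual pattern is to lower `micro_batch_size` and let gradient accumulation keep the effective batch size unchanged. A sketch of how those knobs relate (variable names approximate; see the hyperparameters at the top of the finetuning scripts):

```python
# Hyperparameters near the top of the script (names approximate).
batch_size = 64        # effective batch size per optimizer step
micro_batch_size = 1   # lower this first when you hit CUDA OOM

# More accumulation steps trade speed for memory; the math stays the same.
gradient_accumulation_iters = batch_size // micro_batch_size
assert batch_size % micro_batch_size == 0
```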
Just the 7B model (no training, etc.) requires 29 GB with mixed precision and 14.5 GB with true half precision. See the math in https://github.com/Lightning-AI/lit-gpt/issues/159#issuecomment-1599820686
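Roughly, those numbers are just the parameter count times the bytes per parameter (a back-of-envelope sketch; the exact count depends on the checkpoint):

```python
n_params = 7.2e9  # approximate parameter count for a 7B model

bytes_mixed = n_params * 4  # mixed precision keeps the weights in fp32 (4 bytes each)
bytes_half = n_params * 2   # true half precision stores them in 16 bits (2 bytes each)

print(f"mixed precision: {bytes_mixed / 1e9:.1f} GB")  # ~28.8 GB
print(f"true half:       {bytes_half / 1e9:.1f} GB")   # ~14.4 GB
```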
NaNs are likely to occur with 16-true precision: https://github.com/Lightning-AI/lit-gpt/issues/291#issuecomment-1645396074
The most recent updates removed the use of this config file. Did you pull `main`?
Did you pull the latest changes? What script did you run, what arguments did you pass? Did you make any changes to the script?