Sebastian Raschka
@Abecid What error are you getting with bfloat16? I think it's only supported on Ampere and newer GPUs, but it appears that it now also works on older T4s and on CPU...
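For reference, a quick way to check is something like this (a minimal sketch, assuming a reasonably recent PyTorch version):

```python
import torch

# Is a CUDA device visible at all?
print(torch.cuda.is_available())

# Does the GPU support bfloat16? (Ampere and newer return True;
# some older cards may as well, depending on the PyTorch/CUDA version.)
if torch.cuda.is_available():
    print(torch.cuda.is_bf16_supported())

# bfloat16 tensors on the CPU also work in recent PyTorch versions
x = torch.ones(2, 2, dtype=torch.bfloat16)
print(x.dtype)
```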
While this repository is only focused on the first Llama model to keep the code as simple and readable as possible, we have the [LitGPT repository](https://github.com/Lightning-AI/litgpt) (which is an extension...
I just saw your comment also in https://github.com/Lightning-AI/litgpt/issues/1333. Let's continue the discussion there.
I am not entirely sure, but https://github.com/Lightning-AI/lit-llama/blob/main/scripts/convert_checkpoint.py might be doing that
Yes, full finetuning is supported via the [finetune/full.py](https://github.com/Lightning-AI/lit-gpt/blob/main/finetune/full.py) script, given a Llama 2 model provided via the `--checkpoint_dir` argument in Lit-GPT. You can also use a custom dataset, given that you prepare...
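For example, an invocation could look roughly like this (the checkpoint path is just a hypothetical placeholder; `--checkpoint_dir` is the argument mentioned above):

```bash
# Placeholder checkpoint directory; point this at your converted Llama 2 weights
python finetune/full.py --checkpoint_dir checkpoints/meta-llama/Llama-2-7b-hf
```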
In general, if you start a new Python session, does

```python
import torch
print(torch.cuda.is_available())
```

show `True`?
I don't have a good explanation, but maybe you accidentally set `devices = 1` here?
There might be a SLURM (not Lit-LLaMA-specific) problem with requesting the GPUs. You could add the following PyTorch code at the top to see if the machine indeed has multiple GPUs...
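A minimal sketch of such a check:

```python
import torch

# How many GPUs does PyTorch actually see on the allocated node?
print(torch.cuda.is_available())   # True if at least one GPU is usable
print(torch.cuda.device_count())   # should match the number of GPUs you requested
```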
Hm, I definitely remember training it ... could you try the following and see if it works?

```python
micro_batch_size = 2
```

or

```python
micro_batch_size = 1
```
It may or may not be related, but are you using `--precision 16-true`? I noticed that it results in NaNs when training some models. If your GPU supports...