lammoh
I'm using the pretrain code with Falcon-7B. I've noticed that the loss didn't change for 400 iterations. ``` iter 1: loss 11.0666, time: 13381.00ms, speed: 306 toks/s/device .... iter 400:...
Hello, I'm using the pretrain code to train Falcon-7B; I've already used lit-llama and trained llama-7B. I noticed that Falcon is very slow compared to LLaMA, and it takes more...
I'd like to request support for BLOOM, as it was pretrained on many languages.
Based on the discussion here: https://github.com/Lightning-AI/lit-llama/pull/435#issuecomment-1667966748, the current code can only convert the base model into Hugging Face format. Converting an adapter requires different code, so I'd like to request your...
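For illustration, here's a rough sketch of the merging step such a converter would need first, assuming both checkpoints are plain PyTorch state dicts; the file paths are hypothetical, and the merged dict still contains adapter-specific tensors that the base-model converter does not know how to map:

```python
# Sketch only: overlay an adapter-only checkpoint on top of the base weights.
# The file paths below are hypothetical examples, not fixed locations in the repo.
import torch

base_state = torch.load("checkpoints/lit-llama/7B/lit-llama.pth", map_location="cpu")
adapter_state = torch.load("out/adapter/lit-llama-adapter-finetuned.pth", map_location="cpu")

merged = dict(base_state)
merged.update(adapter_state)  # adapter checkpoints store only the adapter parameters

torch.save(merged, "out/adapter/lit-llama-merged.pth")
```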
Hi everyone, do you have resources that could help me understand `PackedDataset`? I'm trying to implement two things: **(1) A multiprocessing script for tokenization:** this is done; I implemented the...
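For reference, here's a minimal sketch of the kind of multiprocessing tokenization I mean, using a plain worker pool over input files; `tokenize_file` is an illustrative stand-in, not the repository's `PackedDataset` API:

```python
# Sketch: tokenize many raw-text files in parallel, one worker per file.
from multiprocessing import Pool
from pathlib import Path

def tokenize_file(path: Path) -> int:
    """Stand-in tokenizer: split on whitespace and write the result next to the input."""
    text = path.read_text(encoding="utf-8")
    tokens = text.split()                      # replace with a real tokenizer call
    path.with_suffix(".tok").write_text(" ".join(tokens), encoding="utf-8")
    return len(tokens)

if __name__ == "__main__":
    files = sorted(Path("data/raw").glob("*.txt"))
    with Pool(processes=8) as pool:
        counts = pool.map(tokenize_file, files)
    print(f"files: {len(files)}, total tokens: {sum(counts)}")
```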
I think #357 should be applied to the pretrain script as well. Thank you so much, Lightning team, for this amazing repository.
Hello, according to our discussion [here](https://github.com/Lightning-AI/lit-llama/issues/330#issuecomment-1567376696), I think `devices` should be changed in the [pretraining code](https://github.com/Lightning-AI/lit-llama/blob/main/pretrain/redpajama.py#L117) to `fabric.world_size`, since the batch size refers to the global batch size. `devices` in...
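To make the proposal concrete, here is a small runnable sketch of the intended computation; the hyperparameter values are illustrative, not the script's actual defaults:

```python
# Sketch: derive gradient accumulation from the *global* batch size using
# fabric.world_size (devices * num_nodes) rather than the per-node `devices` count.
import lightning as L

fabric = L.Fabric(accelerator="cpu", devices=1)
fabric.launch()

batch_size = 125        # illustrative global batch size
micro_batch_size = 5    # illustrative per-device micro batch size

gradient_accumulation_iters = batch_size // fabric.world_size // micro_batch_size
print(gradient_accumulation_iters)  # 25 when world_size == 1
```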
I would like to request a new feature in the code: the ability to resume training from a checkpoint. Currently, the code can save a checkpoint of the model's state...
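As a sketch of what I have in mind, Lightning Fabric's `fabric.save`/`fabric.load` can already round-trip a state dict that bundles the weights, optimizer state, and step counter; the toy model, paths, and intervals below are illustrative only:

```python
# Sketch: resumable training state with Fabric (toy model and intervals for illustration).
from pathlib import Path

import torch
import lightning as L

fabric = L.Fabric(accelerator="cpu", devices=1)
fabric.launch()

model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer = fabric.setup(model, optimizer)

ckpt_path = Path("out/pretrain/last.ckpt")
ckpt_path.parent.mkdir(parents=True, exist_ok=True)

# Everything needed to resume lives in one dict: weights, optimizer state, step counter.
state = {"model": model, "optimizer": optimizer, "iter_num": 0}
if ckpt_path.exists():
    fabric.load(ckpt_path, state)  # restores the contents of `state` in place

max_iters, save_interval = 100, 25
while state["iter_num"] < max_iters:
    x = torch.randn(4, 8)
    loss = model(x).pow(2).mean()
    fabric.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
    state["iter_num"] += 1
    if state["iter_num"] % save_interval == 0:
        fabric.save(ckpt_path, state)  # a later run picks up from here
```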
Hi, I'm using multi-node training and I need to know how to calculate the hyperparameter values in the train_redpajama script. Can you please elaborate more on how to set these...
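In case it helps frame the question, this is the arithmetic I'm assuming ties the values together; the numbers are illustrative, not the script's defaults:

```python
# Illustrative multi-node bookkeeping; not the script's actual default values.
num_nodes = 2
devices_per_node = 8
world_size = num_nodes * devices_per_node            # total processes = 16

global_batch_size = 512      # sequences per optimizer step across all devices
micro_batch_size = 4         # sequences per forward/backward pass on one device
block_size = 2048            # tokens per sequence

gradient_accumulation_iters = global_batch_size // world_size // micro_batch_size  # 8
tokens_per_optimizer_step = global_batch_size * block_size                          # 1,048,576
print(world_size, gradient_accumulation_iters, tokens_per_optimizer_step)
```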
Changing the model at inference time from sampling to greedy decoding leads to generating empty audio files. When printing the generated audio sequences, I can see that the tokens are being repeated. This...
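For context, here is a minimal single-step sketch of the two decoding modes I'm switching between; the vocabulary size and the temperature/top-k values are illustrative:

```python
# Sketch: greedy decoding vs. temperature/top-k sampling for one generation step.
import torch

logits = torch.randn(32000)              # illustrative vocabulary size

# Greedy: always take the argmax, deterministic and prone to repetition loops.
greedy_token = torch.argmax(logits)

# Sampling: temperature plus top-k keeps some diversity in the output sequence.
temperature, top_k = 0.8, 200
scaled = logits / temperature
v, _ = torch.topk(scaled, top_k)
scaled[scaled < v[-1]] = -float("inf")   # mask everything outside the top-k
probs = torch.softmax(scaled, dim=-1)
sampled_token = torch.multinomial(probs, num_samples=1)
print(greedy_token.item(), sampled_token.item())
```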