Carlos Mocholí

428 comments of Carlos Mocholí

Bumping this! I see there are Python 3.9 Docker images published at https://gcr.io/tpu-pytorch/xla, so having wheels for them would be a nice next step

We can choose the precision based on whether deepspeed is used. I guess @awaelchli manually changed the precision value when trying out deepspeed in https://github.com/Lightning-AI/lit-llama/pull/128 (where this code originally comes...
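For illustration, a minimal sketch of that selection; the `use_deepspeed` flag and the exact precision strings here are assumptions, not the repo's actual API:

```python
def choose_precision(use_deepspeed: bool) -> str:
    # Hypothetical mapping: DeepSpeed typically runs with 16-bit mixed
    # precision, while the plain path can keep true bf16 weights.
    return "16-mixed" if use_deepspeed else "bf16-true"
```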

Instead of an `--interactive` flag, it would be better to add a `chat_adapter.py` script that supports it and streams the output. Since this adds quite a bit of logic, it's...
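A rough sketch of the loop such a script could run; `generate_stream`, `model`, and `tokenizer` are hypothetical stand-ins for the real lit-llama pieces, not its actual interface:

```python
from typing import Iterator

def generate_stream(model, prompt_ids) -> Iterator[int]:
    """Hypothetical stand-in for a streaming `generate()`: yields one new
    token id at a time instead of returning the full sequence at the end."""
    ...

def chat(model, tokenizer) -> None:
    # Keep reading prompts until the user quits, printing tokens as they arrive.
    while True:
        prompt = input(">> ")
        if prompt in ("quit", "exit"):
            break
        for token_id in generate_stream(model, tokenizer.encode(prompt)):
            print(tokenizer.decode(token_id), end="", flush=True)
        print()
```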

Adding

```python
roped = (x * cos) + (rotated * sin)
return roped.type_as(x)
```

fixes the error above, but the test still fails. Generation looks fine though:

```
pytest tests/test_model.py::test_model_bfloat16...
```
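For context, a small runnable illustration of why the `.type_as(x)` cast matters when the RoPE `cos`/`sin` buffers are float32 while `x` is bfloat16 (the shapes here are arbitrary):

```python
import torch

x = torch.randn(2, 4, dtype=torch.bfloat16)
cos = torch.randn(2, 4)  # float32, as a RoPE cache buffer might be

# Mixed bfloat16/float32 arithmetic promotes the result to float32...
print((x * cos).dtype)             # torch.float32
# ...so casting back keeps the activations in bfloat16.
print((x * cos).type_as(x).dtype)  # torch.bfloat16
```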

cc @t-vi, maybe you can catch this bug easily as you wrote this test originally

Can you share exactly what script you ran?

I don't think so. But you might need to tweak hyperparameters. This is the dark art of machine learning :wink:

Do you still see this behaviour, and if so, can you share exactly the code you ran and the arguments passed?

This is because LLaMA fine-tuning is hardcoded to a `max_seq_length` of `256`: https://github.com/Lightning-AI/lit-llama/blob/main/scripts/prepare_alpaca.py#L26, https://github.com/Lightning-AI/lit-llama/blob/main/finetune/adapter.py#L52. This repository, by contrast, is configured to use the longest sequence length in Alpaca: `1037`. If you override...
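As a hedged illustration of where a value like `1037` can come from, assuming each prepared sample exposes its tokenized `input_ids` (the helper name is made up):

```python
def longest_seq_length(samples: list) -> int:
    # Scan the prepared dataset for the longest tokenized example;
    # for Alpaca prepared as in this repo, that comes out to 1037.
    return max(len(sample["input_ids"]) for sample in samples)
```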

That is a good observation. I agree that we should remove the `max_seq_length` value passed to the model forward and separate it from the max_length used to split the dataset....
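A hypothetical sketch of that separation, with made-up names: the model's forward is bounded by its `block_size`, while the dataset split uses an independent cap:

```python
from dataclasses import dataclass

@dataclass
class FinetuneConfig:
    block_size: int = 2048       # hard limit the model's forward supports
    data_max_length: int = 1037  # independent cap used only to split/pad data

    def effective_data_length(self) -> int:
        # The data cap should never exceed what the model can consume.
        return min(self.data_max_length, self.block_size)
```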