Andrei-Aksionov
One recommendation, if we want to merge this at some point in the future: since we need to deal with legacy checkpoints, we have to somehow determine whether we need to reassemble them into the "non-interleaved" layout...
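For context, here is a toy sketch of what "reassembling into non-interleaved" can mean for a fused QKV weight: in the interleaved layout the rows alternate per attention head (q, k, v for head 0, then q, k, v for head 1, ...), while the non-interleaved layout groups all q rows, then all k rows, then all v rows. The function name and row representation are illustrative only, not the actual conversion code:

```python
def deinterleave_qkv(rows, n_head, head_size):
    """Convert a fused QKV weight from per-head interleaved layout
    [q_h0, k_h0, v_h0, q_h1, k_h1, v_h1, ...]
    to grouped layout [q_all, k_all, v_all].

    `rows` is any sequence of weight rows of length n_head * 3 * head_size.
    """
    q, k, v = [], [], []
    for h in range(n_head):
        base = h * 3 * head_size
        q.extend(rows[base : base + head_size])
        k.extend(rows[base + head_size : base + 2 * head_size])
        v.extend(rows[base + 2 * head_size : base + 3 * head_size])
    return q + k + v


# Two heads, one row per projection per head:
rows = ["q0", "k0", "v0", "q1", "k1", "v1"]
print(deinterleave_qkv(rows, n_head=2, head_size=1))
# → ['q0', 'q1', 'k0', 'k1', 'v0', 'v1']
```

Detecting which layout a legacy checkpoint uses (e.g. from a config flag or checkpoint metadata) would be the tricky part; the reshuffle itself is mechanical.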
I've just called Andrej, and he doesn't mind if we rename it to `qkv`.
Hello @jwkirchenbauer, I think #1341 is architecturally similar to OLMo, so after it is merged, it should be much easier to implement OLMo. Sometime next week, maybe. Depends on when...
_**A tale of a PR that rose from the ashes.**_ 🙂 I tested both models; they generate normal-looking text, though they cannot answer a question. As I understand it, these models aren't...
It looks like the instruct variant expects a prompt in a specific format:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-Instruct-hf")
message = [{"role": "user", "content": "{prompt}"}]
display(tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True))
```

...
I can confirm that the issue is not with `torch` and `mps`:
Hello @RookieXwc, thanks for the question. To me it also looks strange, but I would like confirmation from @carmocca that we aren't missing anything.
Windows is odd. What makes it a bit more "spicy" is that, apparently, none of the maintainers has a Windows machine. Debugging it via CI is sub-optimal, to say the least....
I have already tried it :) And it didn't work. ~I'll create a PR a bit later~ Sebastian was faster.
https://www.apple.com/shop/buy-mac/mac-pro/rack 