David Quarel

Results: 3 issues by David Quarel

**Describe the bug**
Loading Llama-2 throws the error `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!`

**Code example**...

Any tokens following ` 私` fail to be tokenized with the `meta-llama/Meta-LLama-3-8B` tokenizer.

![Image](https://github.com/user-attachments/assets/73f16cf9-6942-47d4-a65c-9ec3f2df36cc)
![Image](https://github.com/user-attachments/assets/05cc2ef4-2333-428b-baa2-21de5e24f0ac)

This doesn't match the behaviour of the Hugging Face tokenizer:

![Image](https://github.com/user-attachments/assets/9e5df98b-ef16-4d13-9e41-64330518fbdd) ``` >>>...

**Describe the bug**
`PosEmbed` uses the incorrect device when used with the `accelerate` library.

**Code example**
The following code is a minimal training loop that trains the `gpt2` model on randomly generated...