Awni Hannun
That is very odd. The tokenizer copying is very simple in MLX LM: we basically load it with Hugging Face and then save it with Hugging Face. There is no MLX...
@fblissjr you can reproduce the behavior with:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CohereForAI/c4ai-command-r-plus")
tokenizer.save_pretrained(".")
```

I feel that should not break the tokenizer.. so it might be worth...
Curious.. what machine are you on? What OS?
The command runs fine for me with our default dataset:

```
python -m mlx_lm.lora \
    --model google/gemma-2b-it \
    --train \
    --data ../lora/data \
    --iters 600 --adapter-path .
```
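For context, the default data is just JSONL files (this assumes you're running from inside an mlx-examples checkout, as in the command above):

```
# Each split is a JSONL file with one {"text": "..."} example per line.
ls ../lora/data
# train.jsonl  valid.jsonl  test.jsonl
```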
Hmm, that's annoying. It's definitely not from MLX; we don't use OpenMP. If I had to guess, it's probably from `numba`.
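One quick way to see whether `numba` is even in your environment (just a guess at the culprit, as above):

```
pip show numba
```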
I'm not sure.. I never saw that error before. It seems related to too much resource use (e.g. OOM). Does it run if you use a smaller batch size? `--batch-size=1`?
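As a sketch, that's the LoRA command from earlier with the batch size flag added (model and data paths here are placeholders for whatever you're actually running):

```
python -m mlx_lm.lora \
    --model google/gemma-2b-it \
    --train \
    --data ../lora/data \
    --batch-size 1
```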
I'm running the command you shared.
@danny-su what version of MLX / MLX LM are you using? `python -c "import mlx.core as mx; print(mx.__version__)"` If it's not the latest, please update and try again. So far...
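If it helps, the checks and the upgrade look roughly like this (assuming a pip-managed install):

```
python -c "import mlx.core as mx; print(mx.__version__)"
pip show mlx-lm
pip install -U mlx mlx-lm
```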
> Can I use the same size data to fine-tune Mistral-7B-Instruct-v0.2, quantized or fp16?
Wow, that is a really long sequence length: `102400`. I can't imagine you have enough memory on your machine for a sequence that long. Just the attention scores...
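As a rough back-of-the-envelope (assuming fp16 scores and ignoring everything else the model needs):

```
# 102400 x 102400 attention scores at 2 bytes each, for a single head in a single layer
python -c "print(102400 * 102400 * 2 / 1024**3, 'GiB')"
# -> ~19.5 GiB per head per layer, before weights, activations, or the KV cache
```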