Llama-3 now supported
Colab for Llama-3 8b: https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp?usp=sharing
Edit 2: I'm not sure why it worked that time, but it's back. Probably something to do with my environment?
~~Edit: running `pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes` over my conda environment fixed it.~~
Thanks @danielhanchen!
I changed the model from mistral to llama-3 in my training script based on the ChatML notebook from the readme (gist of code and full log), and it went from working to crashing on `trainer.train()`:
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 432, in LlamaDecoderLayer_fast_forward
hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 190, in fast_rms_layernorm
out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 144, in forward
fx[(n_rows,)](
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/runtime/jit.py", line 550, in run
bin.c_wrapper(
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/compiler/compiler.py", line 692, in __getattribute__
self._init_handles()
File "/home/tnunamak/applications/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/compiler/compiler.py", line 683, in _init_handles
mod, func, n_regs, n_spills = fn_load_binary(self.metadata["name"], self.asm[bin_path], self.shared, device)
RuntimeError: Triton Error [CUDA]: device-side assert triggered
Aborted (core dumped)
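For context, the only change from the working Mistral run was the model passed to unsloth's loader; a minimal sketch of that swap (the exact arguments in my script may differ) would be:

```python
from unsloth import FastLanguageModel

# Minimal sketch of the model swap; max_seq_length and dtype here are placeholders.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # previously "unsloth/mistral-7b-bnb-4bit"
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
```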
Any ideas?
@tnunamak Yes can reproduce - sorry working on a fix!
I had this same issue when trying to work around the eos token (`<|eot_id|>`) issue manually.
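For reference, that kind of manual workaround looks roughly like this (a sketch only - the model name and exact calls are illustrative, not the precise code I ran):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit",  # illustrative model name
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Point EOS at Llama-3's end-of-turn token so generation stops after each assistant reply.
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
tokenizer.eos_token = "<|eot_id|>"
model.config.eos_token_id = eot_id
model.generation_config.eos_token_id = eot_id
```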
@sion42x Yep, working on a fix - I think I'll push it in today - many apologies for the issue!
@tnunamak @sion42x Fixed!! On a local machine, please reinstall, i.e. via:
pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git
For Colab / Kaggle, restart and run again :) Sorry about the issue!
The issue still exists in the new update, I think. Did anyone solve the problem? Thanks
@emreekmekcioglu1 Did you try reinstalling?
pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git
> The issue still exists in the new update, I think. Did anyone solve the problem? Thanks
It's possible the issue is in the engine you're using to run it. Not everything runs Llama 3 well. You can see this with the GGUF I finetuned with unsloth: https://huggingface.co/yaystevek/llama-3-8b-Instruct-OpenHermes-2.5-QLoRA-GGUF
It works great through llama.cpp directly and other tools built on it, but Ollama, for example, still had infinite generation issues.
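If you want to sanity-check the GGUF outside Ollama, the llama-cpp-python bindings are a quick way to do it (a sketch - the model path and stop strings are placeholders, not part of the repo above):

```python
from llama_cpp import Llama

# Path is a placeholder for wherever the quantized GGUF was downloaded.
llm = Llama(model_path="llama-3-8b-instruct-openhermes.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Explain QLoRA in one sentence.",
    max_tokens=128,
    stop=["<|eot_id|>", "<|end_of_text|>"],  # explicit stop strings avoid runaway generation
)
print(out["choices"][0]["text"])
```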
Hmm, has Ollama updated their implementation?