Daniel Han

Results 781 comments of Daniel Han

Update: Hi, so I managed to test HF -> llama.cpp without Unsloth, to remove Unsloth from the picture. 1. '\n\n' is tokenized as [1734, 1734], unless I prompted it...
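A minimal sketch of the kind of check described above: comparing how two pipelines tokenize the same string and reporting where the token ids diverge. The id values below are illustrative placeholders (only [1734, 1734] for '\n\n' comes from the comment; the other sequence is hypothetical), and the helper function is not part of Unsloth or llama.cpp.

```python
def diff_tokenizations(ids_a, ids_b):
    """Return positions where two token-id sequences disagree.

    Note: zip() stops at the shorter sequence, so a pure length
    mismatch past that point is not reported here.
    """
    return [i for i, (a, b) in enumerate(zip(ids_a, ids_b)) if a != b]

# '\n\n' as two tokens, as observed via HF in the comment above.
hf_ids = [1734, 1734]
# Hypothetical output from another pipeline that merges the newlines.
other_ids = [3434]

print(diff_tokenizations(hf_ids, other_ids))
```

In practice you would obtain both id lists from the real tokenizers (e.g. the HF tokenizer versus llama.cpp's) and diff them the same way.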

It should be fixed!

Currently we do not support multi-GPU. Use our Kaggle Llama-3 notebook as-is: https://www.kaggle.com/code/danielhanchen/kaggle-llama-3-8b-unsloth-notebook - it does not work on 2x T4s.

Ye, sadly not 128K yet - it's on the roadmap though! It's sadly not RoPE but some other scaling mechanism.

Oh, `adapter_config.json` is the `config.json` equivalent for a LoRA adapter. If you're looking for Ooba inference or GGUF, please use our saving to 16bit instead.
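To illustrate the distinction: an `adapter_config.json` describes only a LoRA adapter (base model name, rank, target modules), not full model weights, which is why tools expecting a full checkpoint plus `config.json` (like GGUF converters) can't use it directly. A small self-contained sketch, with illustrative key values in the typical PEFT adapter-config shape:

```python
import json
import os
import tempfile

# Illustrative adapter config; the keys follow the usual PEFT layout,
# the values here are placeholders.
adapter_cfg = {
    "base_model_name_or_path": "unsloth/llama-3-8b-bnb-4bit",
    "peft_type": "LORA",
    "r": 16,
    "lora_alpha": 16,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "adapter_config.json")
    with open(path, "w") as f:
        json.dump(adapter_cfg, f, indent=2)

    # The file only points at a base model and describes the low-rank
    # deltas -- there are no full weights here to convert to GGUF,
    # hence the advice to merge and save to 16bit first.
    with open(path) as f:
        loaded = json.load(f)
    print(loaded["peft_type"], loaded["r"])
```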