hoshi-hiyouga
ping @mlinmg
> No problem, I'll do it tomorrow. Also, please send the error it gives you; I can't manage to reproduce it with this script:
>
> ```
> CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
>     --stage sft \...
> ```
There are too many dummy commits in this PR; opening a new one would be more efficient. Btw, Qwen1.5 models do not work for me in either the non-FA2 or the FA2 path.
It may be because the embed tokens and lm head are saved in 32-bit precision, leading to an increase in file size. You can merge the LoRA adapter using...
Update the code and select `cuda` as the export device.
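For context, a merge/export config along these lines is what selects the export device; the key names below are assumptions based on LLaMA-Factory's merge-LoRA examples, so verify them against the repo's own example configs before use:

```yaml
# Hypothetical merge/export config sketch; key names assumed,
# check LLaMA-Factory's merge-LoRA examples for the exact schema.
model_name_or_path: path/to/base_model
adapter_name_or_path: path/to/lora_adapter
template: default
export_dir: path/to/merged_model
export_device: cuda   # export on cuda instead of cpu, as suggested above
```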
in 3 days
Now we provide the `empty` template, which allows training on text without a specific format. You can pre-process your dataset with your own template and then use the `empty` template to fine-tune...
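As a minimal sketch of what "pre-process your dataset with your own template" could look like: bake your chat format into plain text yourself, then let the `empty` template train on it as-is. The `### Question:` / `### Answer:` format and the field names below are illustrative assumptions, not a schema prescribed by LLaMA-Factory.

```python
# Illustrative pre-processing: render each instruction/response pair into
# a single plain-text field so the `empty` template adds no extra format.
import json

def to_plain_text(example: dict) -> dict:
    """Apply a custom format (assumed here) and return a plain-text sample."""
    text = (
        f"### Question:\n{example['instruction']}\n"
        f"### Answer:\n{example['output']}"
    )
    # Put the fully templated text in the prompt field; the target is
    # left empty because the text already contains the answer.
    return {"instruction": text, "output": ""}

samples = [{"instruction": "What is 2 + 2?", "output": "4"}]
templated = [to_plain_text(s) for s in samples]
print(json.dumps(templated, indent=2))
```

The resulting JSON can then be registered as a dataset and fine-tuned with the `empty` template, so no additional formatting is injected at training time.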
Sorry, it is not supported yet.
The Llama-3 model can generate text in our Linux environment, so I think it is likely an issue with your hardware or environment. ![image](https://github.com/hiyouga/LLaMA-Factory/assets/16256802/fa5f2dde-1fc2-45c0-84e5-cf1cfe89f736)
@JieShenAI Not supported yet.