marko1616 comments

Results: 11 comments by marko1616

Qwen1.5-MOE-A2.7B training issue

You need the preview version of transformers. Install it with `pip install git+https://github.com/huggingface/transformers`

GaLore full-parameter supervised fine-tuning of Qwen1.5-series models always fails with RuntimeError: value cannot be converted to type at::Half without overflow

I can't reproduce this. Can you give the exact library versions? (Or try updating the libraries and the repository.) I am using `transformers==4.39.2` cmdline:`python src/train_bash.py --stage sft --do_train --model_name_or_path /path-to/qwen/ --dataset marko1616 --template qwen --finetuning_type full --optim adamw_8bit --use_galore --galore_layerwise --galore_target mlp,self_attn --galore_rank 256 --output_dir /path-to/save --overwrite_cache --overwrite_output_dir --per_device_train_batch_size 2 --gradient_accumulation_steps...

Is yuan2 supported?

Yes?

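After Qwen training finishes, the adapter does not show up on refresh, and entering the path manually is not recognized either.

That is the command for merging a model, not for training or exporting an adapter.

## Some concepts:
- Adapter: roughly, the "add-on network" you get when you train a model through an add-on network while training only that add-on network.
- LoRA: one kind of adapter.
- export_model.py: merges the add-on network into the main network, so naturally it does not export an "adapter". What it exports should be files similar to the main network.
- train_bash.py: the training script.

## A possible solution for your case
Train with the following command:
```bash
python src/train_bash.py \
    --model_name_or_path /mnt/workspace/qwen/Qwen-7B-Chat \
    --adapter_name_or_path /mnt/workspace/LLaMA-Factory/save \
    --stage sft \
    --do_train \
    --dataset...
```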
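To make the adapter-versus-merged-model distinction above more concrete, here is a toy sketch in plain Python. It is not LLaMA-Factory code: the names `matmul` and `merge_lora` and the tiny matrices are hypothetical, and it assumes the usual LoRA formulation in which the adapter is a pair of small matrices `B` and `A` whose product is added to the base weight `W`. Merging folds that product into `W`, so what remains looks like a normal model weight rather than an adapter.

```python
# Toy illustration of a LoRA-style adapter and of merging it into the base
# weight. Hypothetical helper names; this is not LLaMA-Factory's implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def merge_lora(base_weight, lora_b, lora_a, scale=1.0):
    """Return W + scale * (B @ A): one dense matrix, no separate adapter."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(row_w, row_d)]
            for row_w, row_d in zip(base_weight, delta)]

# Main network weight (2x2). It is not updated during adapter training.
base_weight = [[1.0, 0.0],
               [0.0, 1.0]]

# The adapter: two small matrices (2x1 and 1x2). Only these are trained,
# and only these are saved when an adapter is written to disk.
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, -0.5]]

# Merging produces a matrix with the same shape as the base weight.
merged = merge_lora(base_weight, lora_b, lora_a)

# Check on one input column vector: base path plus adapter path gives the
# same result as the merged weight alone.
x = [[3.0], [4.0]]
base_out = matmul(base_weight, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
unmerged_out = [[p + q for p, q in zip(rp, rq)]
                for rp, rq in zip(base_out, adapter_out)]
merged_out = matmul(merged, x)
```

In this sketch `merged` has the same shape as `base_weight`, which is why a merge step outputs files that resemble the main network, while the adapter itself is just the small `lora_b` and `lora_a` pair.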

Add template&support(tested)


Does LLaMA-Factory support AMD graphic cards?
