Shiyue Xu comments

Results: 7 comments of Shiyue Xu

Question about an error when using mixed-precision-training on V100

> Hi @AIR-hl ! You cannot perform pure fp16 training as it is not supported by pytorch. In order to do mixed precision fp16 training you should either load the...

StopIteration occurs during SFT

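> Is the datasets library up to date, and does the dataset contain at least 50 samples in total?

`datasets` is the latest version, `2.19.1`, and the dataset has 80k+ samples.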

> max_samples: 10000000

After adding this, StopIteration no longer occurs, but a new error appears: `ValueError: Target modules {'q_proj', 'v_proj'} not found in the base model. Please check the target modules and try again.` ![image](https://github.com/hiyouga/LLaMA-Factory/assets/66808901/8b9e9576-2b8d-4ab5-b318-de4900f2ec1d)

The error persists after switching from `phi-1.5` to `phi-2`, and `readme.md` does list `q_proj, v_proj`. ![image](https://github.com/hiyouga/LLaMA-Factory/assets/66808901/1596f53e-4832-4ca5-ba50-b7b69e906864)

----

One more question: after running the training command, the terminal prints the `input_ids` of one sample,...
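A `Target modules ... not found` error of this kind usually means that none of the layer names in the loaded model match the requested LoRA targets. The sketch below illustrates that check in plain Python. It assumes targets are matched by exact name or by dotted suffix, which is an assumption about how the adapter library resolves them. The module names and the helper `find_missing_targets` are hypothetical and chosen only for illustration; they are not read from any real checkpoint.

```python
def find_missing_targets(module_names, target_modules):
    """Return the targets that match no module name.

    A target counts as matched when a module name equals it or ends
    with "." + target (assumed matching rule, see the note above).
    """
    missing = set()
    for target in target_modules:
        matched = any(
            name == target or name.endswith("." + target)
            for name in module_names
        )
        if not matched:
            missing.add(target)
    return missing


# Hypothetical module names, for illustration only. With a real model,
# the names would come from iterating over model.named_modules().
module_names = [
    "layers.0.mixer.Wqkv",
    "layers.0.mixer.out_proj",
    "layers.0.mlp.fc1",
    "layers.0.mlp.fc2",
]

missing = find_missing_targets(module_names, ["q_proj", "v_proj"])
print(sorted(missing))
```

If every requested target comes back as missing, the model's layer names differ from the ones in the config, and the targets need to be changed to names that actually exist in the model.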

StopIteration occurs during SFT

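> Normally you may need to upgrade the phi model files, or change to lora_targets: all

Thank you very much! May I also ask what upgrading the model files means? I am currently using the officially provided `microsoft/phi-2` and `microsoft/phi-1_5`. Thanks again!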

StopIteration occurs during SFT

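> I recommend the approach of changing lora_target

Thanks! Changing lora_target works, but then these errors appear: `ValueError: PhiForCausalLM does not support gradient checkpointing.` `ValueError: PhiForCausalLM does not support Flash Attention 2.0 yet.` My environment is fully up to date, so I will try other models first. Please remember to get some rest!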

StopIteration occurs during SFT

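> flash_attn: auto gradient_checkpointing: false

I would like to raise one more small issue. When I train a LoRA model from the command line, I save to a custom path. But when I try to load a checkpoint's adapter in the webui `Chat` tab, I cannot use that custom path, because a default path is prepended to it. The only workaround is to move the folder containing the checkpoints to the designated `Gemma/lora` path. I hope the logic can be adjusted to remove this default prefix, or simply to change it to `saves`. ![image](https://github.com/hiyouga/LLaMA-Factory/assets/66808901/5582a8c1-d094-4fc3-9435-111f71605b07) ![image](https://github.com/hiyouga/LLaMA-Factory/assets/66808901/010c2b90-3ea6-41b2-9737-401fb0445a18)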

Using PEFT causes model to not predict EOS

> ### System Info > > peft version: 0.9.0 > > accelerate version: 0.27.2 > > transformers version: 4.37.0 > > trl version: 0.7.12.dev0 > > base model: openai-community/gpt2 >...