pppppM comments

Results: 84 comments by pppppM

Is this repo still being maintained?

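Hello, this repo will no longer be maintained. We expect to open-source a new repo in December that will include the contents of mmdetection-distiller and mmsegmentation-distiller, along with more related features.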

shape mismatch when loading llava-phi path

The issue might be due to the local model not being initialized correctly. Before loading the checkpoint, check if the model contains the key `llm.model.layers.0.self_attn.o_proj.weight`.
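One way to run that check is to compare the model's parameter names and shapes against the checkpoint before loading. The sketch below is only illustrative: `compare_shapes` is a hypothetical helper, and it works on plain name-to-shape mappings so that it runs without PyTorch. With a real model, the mappings could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]

# Key named in the comment above.
KEY = "llm.model.layers.0.self_attn.o_proj.weight"


def compare_shapes(
    model_shapes: Mapping[str, Shape], ckpt_shapes: Mapping[str, Shape]
) -> Dict[str, List[str]]:
    """Report checkpoint keys that are absent from the model or differ in shape."""
    missing = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = sorted(
        k
        for k in ckpt_shapes
        if k in model_shapes and tuple(model_shapes[k]) != tuple(ckpt_shapes[k])
    )
    return {"missing_in_model": missing, "shape_mismatch": mismatched}


# Toy shapes for illustration only; real values come from the model and checkpoint.
model_shapes = {"llm.lm_head.weight": (100, 16)}
ckpt_shapes = {KEY: (16, 16), "llm.lm_head.weight": (120, 16)}

report = compare_shapes(model_shapes, ckpt_shapes)
print("key present in model:", KEY in model_shapes)
print(report)
```

If the key shows up under `missing_in_model`, the local model was probably not built with the expected `llm` submodule, which would be consistent with the diagnosis above.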

warning on CUTLASS&sparse_attn&triton

It won't affect anything; this is just a warning emitted by the new versions of PyTorch and Triton.

Error during multi-GPU fine-tuning

You are probably using zero3? If so, please update transformers and bitsandbytes to the latest versions: xtuner==0.1.19, transformers==4.40.2, bitsandbytes==0.43.1. If you don't want to update, you can use zero1 or zero2 instead.
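To see whether an environment already has these versions, a small check along the following lines may help. It is a sketch, not part of xtuner: `parse` and `check_versions` are hypothetical helpers, and treating the quoted versions as minimums is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile
from typing import Callable, Dict, Optional, Tuple

# Versions quoted in the comment above (treated here as minimums; an assumption).
PINNED = {"xtuner": "0.1.19", "transformers": "4.40.2", "bitsandbytes": "0.43.1"}


def parse(ver: str) -> Tuple[int, ...]:
    """Turn '4.40.2' into (4, 40, 2); a non-numeric tail in a piece is ignored."""
    return tuple(
        int("".join(takewhile(str.isdigit, piece)) or 0) for piece in ver.split(".")
    )


def check_versions(
    lookup: Callable[[str], str] = version,
) -> Dict[str, Tuple[Optional[str], bool]]:
    """Map each package to (installed version or None, at least the quoted version?)."""
    result: Dict[str, Tuple[Optional[str], bool]] = {}
    for name, wanted in PINNED.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            result[name] = (None, False)
            continue
        result[name] = (installed, parse(installed) >= parse(wanted))
    return result


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        print(f"{name}: installed={installed} wanted>={PINNED[name]} ok={ok}")
```

The `lookup` parameter exists only so the check can be exercised with a fake mapping; by default it queries the installed packages.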

Error during multi-GPU fine-tuning

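It may be because flash attention cannot be used on the V100; the longer the sequence, the more pronounced the memory gap relative to the 4090 becomes. You can try zero3 + QLoRA to reduce memory usage. Otherwise the LLM part is not sharded, and every GPU carries the full memory footprint of the 4-bit LLM.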
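The difference between a replicated and a sharded 4-bit LLM can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation under stated assumptions: the 8e9-parameter model and the 4 GPUs are hypothetical, sharding is assumed to be perfectly even, and quantization metadata, activations, LoRA adapters, and optimizer state are all ignored.

```python
def llm_weight_gb_per_gpu(
    num_params: float, bits: int, num_gpus: int, sharded: bool
) -> float:
    """Rough weight-only memory per GPU in GB (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits / 8
    per_gpu_bytes = total_bytes / num_gpus if sharded else total_bytes
    return per_gpu_bytes / 1e9


# Hypothetical setup: an 8e9-parameter LLM quantized to 4 bits, trained on 4 GPUs.
replicated = llm_weight_gb_per_gpu(8e9, bits=4, num_gpus=4, sharded=False)
sharded = llm_weight_gb_per_gpu(8e9, bits=4, num_gpus=4, sharded=True)

print(f"replicated on every GPU: {replicated:.2f} GB")
print(f"sharded across GPUs:     {sharded:.2f} GB")
```

Real savings will differ, but the estimate shows why the per-GPU cost of the quantized weights stays constant without sharding, no matter how many GPUs are added.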

Error during multi-GPU fine-tuning

@Franklin-L See this documentation: https://xtuner--664.org.readthedocs.build/zh-cn/664/acceleration/deepspeed.html

Loading weights

@zwhus In v0.1.19 we changed the default save format to bin. If you want to save in safetensors format, you can add `--safe-serialization`. https://github.com/InternLM/xtuner/pull/648

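Column-name mismatch reported when using the official script with its corresponding dataset; error when using a custom dataset in the same format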

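@LumenScope First, regarding how the map fn is defined: an mmengine config cannot define new functions inside the config file; they can only be brought in via import. See https://github.com/InternLM/xtuner/tree/main/examples/demo_data/multi_turn_2#config for details. Second, for a custom dataset, you can run `xtuner check-custom-dataset $CONFIG` to find where the format is wrong. Finally, you can run `xtuner log-dataset $CONFIG` to inspect the converted data.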
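As an illustration of the first point, a map fn can live in its own module next to the config and be imported from there. The sketch below is a guess at what such a module might look like: `qa_map_fn` and the `question`/`answer` column names are hypothetical, and the `conversation` output layout is an assumption based on the linked example, which also shows the exact import form to use in the config.

```python
# map_fn.py: kept next to the config file and imported from it.
from typing import Dict, List


def qa_map_fn(example: Dict[str, str]) -> Dict[str, List[Dict[str, str]]]:
    """Convert one raw record into a single-turn conversation.

    The `question`/`answer` column names are hypothetical; adapt them to the
    columns the dataset actually has.
    """
    return {
        "conversation": [
            {
                "system": "",
                "input": example["question"],
                "output": example["answer"],
            }
        ]
    }


# The config imports this function instead of defining it inline.
sample = {"question": "What is XTuner?", "answer": "A fine-tuning toolkit."}
print(qa_map_fn(sample))
```

Running the function on a single record like this is a quick way to catch column-name mismatches before launching a full job.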

The following error occurs when merging llama3; the same problem also appeared when using zero3

Insufficient GPU memory leaves some of the model's parameters as meta tensors; append `--device cpu` to the command.
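To confirm that diagnosis, one could list the parameters that are still on the meta device. This is a minimal sketch with a hypothetical helper, `find_meta_params`; it relies only on the `is_meta` attribute that PyTorch tensors expose, and uses stand-in objects so that it runs without PyTorch installed.

```python
from types import SimpleNamespace
from typing import Iterable, List, Tuple


def find_meta_params(named_params: Iterable[Tuple[str, object]]) -> List[str]:
    """Return the names of parameters whose `is_meta` attribute is true."""
    return [name for name, p in named_params if getattr(p, "is_meta", False)]


# Stand-ins for tensors so the sketch runs without PyTorch; with a real model,
# pass `model.named_parameters()` instead.
fake_params = [
    ("layers.0.weight", SimpleNamespace(is_meta=False)),
    ("layers.1.weight", SimpleNamespace(is_meta=True)),
]
print(find_meta_params(fake_params))
```

A non-empty result means some weights were never materialized, which matches the situation described above.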