whcao

Results 53 comments of whcao

Hi @yinfan98 ! Thank you for your advice. In order to unify these names, it’s essential to ensure that the checkpoint being loaded before inference is also adjusted accordingly. Prior...
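For illustration, a minimal sketch of what "adjusting the checkpoint accordingly" could look like before inference; the key names (`old_proj` → `new_proj`) and file paths here are hypothetical, not taken from the actual change:

```python
import torch

# Load the checkpoint that was saved with the old parameter names.
state_dict = torch.load("checkpoint.pth", map_location="cpu")

# Rename keys so they match the unified names expected at inference time.
# The "old_proj" -> "new_proj" mapping is purely illustrative.
remapped = {key.replace("old_proj", "new_proj"): value for key, value in state_dict.items()}

# Save it back so the inference script can load it with strict key matching.
torch.save(remapped, "checkpoint_renamed.pth")
```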

> Hi, @HIT-cwh Do we support the visualization of the weight values?

Support for this feature is currently in development and will be progressively enhanced in forthcoming iterations.

> May add user guide about the usage of this great tool.

The commit that fixes the load ckpt bug has been split out. Please refer to [pr690](https://github.com/InternLM/lmdeploy/pull/690).

Apologies for the inconvenience this issue may have caused. At present, W8A8 quantization of Baichuan2 is not supported; support for it will be developed as soon as possible.

> @HIT-cwh I use this config, just set batch_size=4. https://github.com/InternLM/xtuner/blob/193f614ffbb2463010808ebb2e689331a9c5e4f6/xtuner/configs/qwen/qwen1_5/qwen1_5_0_5b_chat/qwen1_5_0_5b_chat_qlora_alpaca_e3.py#L40C8-L40C8 Then I use the command `CUDA_VISIBLE_DEVICES=4,5,6,7 NPROC_PER_NODE=4 xtuner train qwen1_5_0_5b_chat_qlora_alpaca_e3` to train.
>
> Thanks for your tip, I didn't...

Could you please provide the following two pieces of information:

1. Are you fine-tuning llama2 base or llama2 chat?
2. After changing the chat template for training, did you change the chat template accordingly for evaluation?

In principle, we don't recommend fine-tuning a base model with QLoRA, because QLoRA freezes the embedding layer, and the base model never saw the tokens in the chat template (e.g. the `` in the llama3 chat template) during pre-training. As a result, even after QLoRA fine-tuning, the model still doesn't recognize the tokens in the chat template. We recommend full-parameter fine-tuning to teach a base model conversational ability, or LoRA/QLoRA fine-tuning on top of the chat model.
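To illustrate the frozen-embedding point, here is a minimal sketch using the PEFT library directly (xtuner's QLoRA configs build on the same mechanism); the model name and LoRA hyperparameters are only examples, and loading the gated llama2 repo requires access:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Any causal LM works here; the llama2 name is illustrative.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora_cfg = LoraConfig(r=64, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_cfg)

# Under (Q)LoRA the embedding weights receive no gradients, so new
# chat-template tokens cannot be learned ...
print(model.get_input_embeddings().weight.requires_grad)  # False

# ... only the injected LoRA matrices are trainable.
model.print_trainable_parameters()
```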

This is because the chat template contains special tokens, e.g. `[INST]` in llama2 and `` in llama3. llama2's vocabulary contains `[INST]`, and likewise llama3's vocabulary contains the string ``, so during tokenization these special tokens can be mapped to a single, specific token_id. If you train llama2 with the llama3 chat template, llama2's tokenizer doesn't recognize the special strings in that template, which hurts performance. So it's better to train llama3 base or chat with the llama3 chat template.
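A small sketch of this tokenization difference, assuming the Hugging Face tokenizers for the two model families and using `<|eot_id|>` (one of llama3's chat-template tokens) as the example; both Hub names below are gated repos used purely for illustration:

```python
from transformers import AutoTokenizer

llama3_tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
llama2_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

special = "<|eot_id|>"  # a special token in the llama3 chat template

# llama3's tokenizer maps the string to a single reserved token_id ...
print(llama3_tok.encode(special, add_special_tokens=False))

# ... while llama2's tokenizer, which has never seen it, splits it into
# several ordinary token_ids.
print(llama2_tok.encode(special, add_special_tokens=False))
```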

Could you share your training log? I'm a bit worried the problem is that QLoRA can't learn the chat template.

Judging from the EvaluateChatHook output during training, the model hasn't learned how to generate the stop token. This is expected, because QLoRA freezes the embedding layer and therefore can't learn the new chat template. We recommend training on top of llama3 chat, or full-parameter fine-tuning of llama3 base.

Hi @Espere-1119-Song ! I'm uncertain if I'm grasping this accurately. In the provided code snippet, the video frames obtained through `cv2.VideoCapture` are in BGR format, whereas the images passed into...
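A minimal sketch of the channel-order conversion being described, with the video path purely illustrative:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")
frames = []
while True:
    ok, frame_bgr = cap.read()
    if not ok:
        break
    # OpenCV decodes frames in BGR order; swap channels to RGB before
    # passing them to a PIL/RGB-based preprocessing pipeline.
    frames.append(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
cap.release()
```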