hoshi-hiyouga

Results: 294 comments of hoshi-hiyouga

The EOS token id of first-generation ChatGLM should indeed be 130005; it is only ChatGLM2 that uses 2.

It still shows 130005 on my side. ![image](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/16256802/686b2909-7eb0-41d2-9d45-81a9833550ef)
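
A quick way to double-check is to load the tokenizer and print its `eos_token_id`; a minimal sketch, assuming the THUDM/chatglm-6b checkpoint is available locally or from the Hub:

```python
# Minimal sketch: verify the EOS token id of first-generation ChatGLM
# (assumes the THUDM/chatglm-6b checkpoint is available).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
print(tok.eos_token_id)  # expected: 130005 for ChatGLM-6B; ChatGLM2 uses 2
```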

Huggingface's Trainer API already integrates all of the logic above, so we don't need to define it separately in our code. Accelerate is Huggingface's acceleration framework and integrates DeepSpeed's functionality.
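
For illustration only, a minimal sketch of what delegating to the Trainer API looks like (`gpt2` and `my_dataset` are placeholders, not the project's actual setup); DeepSpeed is enabled simply by passing a config path:

```python
# Minimal sketch: the Trainer API handles the optimizer, scheduler, mixed
# precision and gradient accumulation; DeepSpeed is enabled via a config path.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    fp16=True,        # mixed precision handled by the Trainer
    deepspeed=None,   # pass a DeepSpeed JSON config path to enable it
)

trainer = Trainer(model=model, args=args, train_dataset=my_dataset)  # my_dataset: your tokenized dataset
trainer.train()
```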

Most likely a single extremely long input caused the out-of-memory error; you can limit cutoff_len.
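
In effect, lowering cutoff_len truncates each tokenized example before batching, which bounds per-sample memory use; a rough sketch (not the project's actual preprocessing code):

```python
# Rough sketch of what limiting cutoff_len amounts to: truncate over-long
# tokenized examples so a single sample cannot blow up GPU memory.
cutoff_len = 1024  # assumed value; lower it if OOM persists

def truncate(example: dict, cutoff_len: int) -> dict:
    return {
        key: value[:cutoff_len] if key in ("input_ids", "attention_mask", "labels") else value
        for key, value in example.items()
    }
```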

I recommend setting inner_lr to 1.0 and tuning the epsilon with different datasets. We empirically adopt values between 0.05 and 0.5 for epsilon.
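
As a rough sketch of that per-dataset tuning loop (`train_and_eval` is a hypothetical helper, not part of the repository):

```python
# Sketch: fix inner_lr at 1.0 and sweep epsilon per dataset.
# train_and_eval is a hypothetical helper returning a validation metric.
inner_lr = 1.0
candidates = [0.05, 0.1, 0.2, 0.3, 0.5]
best_eps = max(candidates, key=lambda eps: train_and_eval(inner_lr=inner_lr, epsilon=eps))
```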

There are two approaches to tackling this problem. We can apply mean-pooling over the BERT outputs to obtain the label representations. Alternatively, we can take the first word...
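
A minimal sketch of the two pooling options, assuming a Huggingface BERT encoder (bert-base-uncased is only a placeholder checkpoint):

```python
# Sketch of the two pooling choices for label representations.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint
encoder = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer(["positive", "negative"], return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state           # (batch, seq_len, hidden)

# Option 1: mean-pool over non-padding tokens.
mask = inputs["attention_mask"].unsqueeze(-1).float()       # (batch, seq_len, 1)
mean_pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Option 2: take the hidden state at the first position.
first_token = hidden[:, 0]
```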

Hi, I recommend using a Chinese pretrained model, following Huggingface's tutorials: [https://huggingface.co/bert-base-chinese](https://huggingface.co/bert-base-chinese)

Hi, each label in DualCL should be tokenized as a single word. I conjecture that if a two-character Chinese label is encoded into two or more tokens, the DualCL...
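
A quick check (sketch) of whether a label survives as a single token under the bert-base-chinese tokenizer mentioned above; the labels below are only examples, not from the issue:

```python
# Sketch: see how many tokens each candidate label is split into.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
for label in ["好", "差", "积极", "消极"]:  # example labels
    ids = tok.encode(label, add_special_tokens=False)
    print(label, ids, "single token" if len(ids) == 1 else "multiple tokens")
```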

> [ChatGLM-Efficient-Tuning](https://github.com/hiyouga/ChatGLM-Efficient-Tuning): scores responses by the cosine similarity between the embeddings of the model's answer and the expected answer, and does not use human or GPT-4 annotations.

We use a ChatGLM-6B model with an added ValueHead and take the output at the EOS token as the score and as the reward in RLHF; no cosine similarity is involved. In the SFT stage we use a cross-entropy loss, not cosine similarity. RLHF uses the reward together with a per-token KL divergence as the optimization objective. When training the reward model, we used GPT-4 and...
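
A rough sketch of taking the ValueHead output at the final (EOS) position as the score, assuming trl's AutoModelForCausalLMWithValueHead (gpt2 is only a placeholder base model, not the actual ChatGLM-6B setup):

```python
# Sketch: reward/score = ValueHead output at the last (EOS) token position.
from trl import AutoModelForCausalLMWithValueHead
from transformers import AutoTokenizer

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "prompt plus response" + tokenizer.eos_token
inputs = tokenizer(text, return_tensors="pt")
lm_logits, _, values = model(**inputs)  # values: (batch, seq_len) from the ValueHead
score = values[:, -1]                   # value at the final (EOS) position
```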

Already provided: https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/fsdp_qlora/README.md