hoshi-hiyouga

Results: 294 comments of hoshi-hiyouga

The EOS token id of first-generation ChatGLM should indeed be 130005; it is only ChatGLM2 that uses 2.

It still shows 130005 on my side. ![image](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/16256802/686b2909-7eb0-41d2-9d45-81a9833550ef)
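
A quick way to double-check is to load the tokenizer and print its `eos_token_id`; a minimal sketch, assuming the THUDM/chatglm-6b checkpoint is available locally or from the Hub:

```python
# Minimal sketch: verify the EOS token id of first-generation ChatGLM
# (assumes the THUDM/chatglm-6b checkpoint is available).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
print(tok.eos_token_id)  # expected: 130005 for ChatGLM-6B; ChatGLM2 uses 2
```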

Huggingface's Trainer API already integrates all of the logic above, so we don't need to define it separately in our code. Accelerate is Huggingface's acceleration framework and integrates DeepSpeed's functionality.
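
For illustration only, a minimal sketch of what delegating to the Trainer API looks like (`gpt2` and `my_dataset` are placeholders, not the project's actual setup); DeepSpeed is enabled simply by passing a config path:

```python
# Minimal sketch: the Trainer API handles the optimizer, scheduler, mixed
# precision and gradient accumulation; DeepSpeed is enabled via a config path.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    fp16=True,        # mixed precision handled by the Trainer
    deepspeed=None,   # pass a DeepSpeed JSON config path to enable it
)

trainer = Trainer(model=model, args=args, train_dataset=my_dataset)  # my_dataset: your tokenized dataset
trainer.train()
```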

Most likely a single extremely long input caused the out-of-memory error; you can limit cutoff_len.
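
In effect, lowering cutoff_len truncates each tokenized example before batching, which bounds per-sample memory use; a rough sketch (not the project's actual preprocessing code):

```python
# Rough sketch of what limiting cutoff_len amounts to: truncate over-long
# tokenized examples so a single sample cannot blow up GPU memory.
cutoff_len = 1024  # assumed value; lower it if OOM persists

def truncate(example: dict, cutoff_len: int) -> dict:
    return {
        key: value[:cutoff_len] if key in ("input_ids", "attention_mask", "labels") else value
        for key, value in example.items()
    }
```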

I recommend setting inner_lr to 1.0 and tuning the epsilon with different datasets. We empirically adopt values between 0.05 and 0.5 for epsilon.
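
As a rough sketch of that per-dataset tuning loop (`train_and_eval` is a hypothetical helper, not part of the repository):

```python
# Sketch: fix inner_lr at 1.0 and sweep epsilon per dataset.
# train_and_eval is a hypothetical helper returning a validation metric.
inner_lr = 1.0
candidates = [0.05, 0.1, 0.2, 0.3, 0.5]
best_eps = max(candidates, key=lambda eps: train_and_eval(inner_lr=inner_lr, epsilon=eps))
```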

There are two approaches to tackling this problem. We can apply mean-pooling over the BERT outputs to obtain the label representations. Alternatively, we can take the first word...
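
A minimal sketch of the two pooling options, assuming a Huggingface BERT encoder (bert-base-uncased is only a placeholder checkpoint):

```python
# Sketch of the two pooling choices for label representations.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint
encoder = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer(["positive", "negative"], return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state           # (batch, seq_len, hidden)

# Option 1: mean-pool over non-padding tokens.
mask = inputs["attention_mask"].unsqueeze(-1).float()       # (batch, seq_len, 1)
mean_pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Option 2: take the hidden state at the first position.
first_token = hidden[:, 0]
```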

Hi, I recommend using a Chinese pretrained model, following Huggingface's tutorials: [https://huggingface.co/bert-base-chinese](https://huggingface.co/bert-base-chinese)

Hi, each label in DualCL should be tokenized as a single word. I conjecture that if a two-character Chinese label is encoded into two or more tokens, the DualCL...
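
A quick check (sketch) of whether a label survives as a single token under the bert-base-chinese tokenizer mentioned above; the labels below are only examples, not from the issue:

```python
# Sketch: see how many tokens each candidate label is split into.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
for label in ["好", "差", "积极", "消极"]:  # example labels
    ids = tok.encode(label, add_special_tokens=False)
    print(label, ids, "single token" if len(ids) == 1 else "multiple tokens")
```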

> [ChatGLM-Efficient-Tuning](https://github.com/hiyouga/ChatGLM-Efficient-Tuning): scores responses by the cosine similarity between the embeddings of the model's answer and the expected answer, and does not use human or GPT-4 annotations.

We use a ChatGLM-6B model with an added ValueHead and take the output at the EOS token as the score and as the reward in RLHF; no cosine similarity is involved. In the SFT stage we use a cross-entropy loss, not cosine similarity. RLHF uses the reward together with a per-token KL divergence as the optimization objective. When training the reward model, we used GPT-4 and...
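
A rough sketch of taking the ValueHead output at the final (EOS) position as the score, assuming trl's AutoModelForCausalLMWithValueHead (gpt2 is only a placeholder base model, not the actual ChatGLM-6B setup):

```python
# Sketch: reward/score = ValueHead output at the last (EOS) token position.
from trl import AutoModelForCausalLMWithValueHead
from transformers import AutoTokenizer

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "prompt plus response" + tokenizer.eos_token
inputs = tokenizer(text, return_tensors="pt")
lm_logits, _, values = model(**inputs)  # values: (batch, seq_len) from the ValueHead
score = values[:, -1]                   # value at the final (EOS) position
```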

Already provided: https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/fsdp_qlora/README.md