Yuchen Han

6 issues by Yuchen Han

Why do I get this error when running it? Traceback (most recent call last): File "graph_gan.py", line 11, in from src import utils ModuleNotFoundError: No module named 'src'
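This error usually means the script is launched from a directory where the `src` package is not on Python's import path. A minimal workaround, assuming `src/` sits at the repository root and `graph_gan.py` lives one level below it (adjust the relative path to the actual layout), is:

```python
# Hypothetical fix: put the repository root (the directory containing src/)
# on sys.path before importing from it.
import os
import sys

REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, REPO_ROOT)  # makes `src` importable

from src import utils
```

Running the script from the repository root (e.g. `python -m subdir.graph_gan`, with the actual package path substituted) achieves the same thing without editing the code.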

Hello, Open Assistant developers. @andreaskoepf While studying your reward model training code, I noticed that in addition to the ranking loss there is an extra L2 regularization term. What is...
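For context, a pairwise reward-model objective with an added L2 penalty on the predicted scores often looks like the sketch below. This is a generic illustration under that assumption, not the Open Assistant implementation, and the names (`ranking_loss_with_l2`, `l2_coef`) are made up:

```python
import torch
import torch.nn.functional as F

def ranking_loss_with_l2(chosen_scores: torch.Tensor,
                         rejected_scores: torch.Tensor,
                         l2_coef: float = 0.01) -> torch.Tensor:
    # Pairwise ranking loss: push the chosen response's score above the rejected one's.
    rank_loss = -F.logsigmoid(chosen_scores - rejected_scores).mean()
    # Hypothetical L2 term on the scalar rewards themselves; it keeps the score
    # magnitudes small so the model cannot lower the loss by inflating both scores.
    l2_reg = l2_coef * (chosen_scores.pow(2).mean() + rejected_scores.pow(2).mean())
    return rank_loss + l2_reg
```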

Thanks to the authors for generously open-sourcing this. The official README says the Chinese reward model is based on open-chinese-llama-7b, but the later step-by-step instructions give: python merge_weight_zh.py recover --path_raw decapoda-research/llama-7b-hf --path_diff ./models/moss-rlhf-reward-model-7B-zh/diff --path_tuned ./models/moss-rlhf-reward-model-7B-zh/recover, which uses the original llama instead. I first merged with the original llama; the merge succeeded, but the scores were abnormal (#21). I then tried merging with https://huggingface.co/openlmlab/open-chinese-llama-7b-patch, but the integrity check in the code failed: Naive integrity check failed. This could imply that some of the checkpoint files are corrupted. @Ablustrund
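For reference, diff-based weight recovery of this kind typically just adds the released diff tensors to the base model's parameters, which is why merging against the wrong base model can succeed mechanically yet produce nonsense scores. The sketch below is an assumption about the general technique, not the actual behavior of merge_weight_zh.py, and the paths and function name are illustrative:

```python
import torch

def recover_weights(path_raw_state: str, path_diff_state: str, path_out: str) -> None:
    # Hypothetical sketch: tuned = raw + diff, applied tensor by tensor.
    raw = torch.load(path_raw_state, map_location="cpu")
    diff = torch.load(path_diff_state, map_location="cpu")
    merged = {name: raw[name] + diff[name] for name in diff}
    torch.save(merged, path_out)
```

An integrity check on top of this would normally compare checksums (or other statistics) of the merged tensors against values shipped with the diff, so a failure points at either corrupted files or a mismatched base model.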

Thanks to the authors for generously open-sourcing this, but I am running into some problems when scoring with the reward model: for most question-answer pairs it returns negative scores, and the scores also differ a lot between prompts. Could you tell me whether I am using it incorrectly? Here is my code:

```python
import torch
from transformers import LlamaTokenizer
from transformers.models.llama.modeling_llama import LlamaForCausalLM

class LlamaRewardModel(LlamaForCausalLM):
    def __init__(self, config, tokenizer):
        super().__init__(config)
        self.tokenizer = tokenizer
        self.reward_head = torch.nn.Linear(config.hidden_size, 1, bias=False)
        # ... (snippet truncated in the issue listing)
```
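A scoring sketch built on the class above would look roughly as follows. This is a hypothetical usage example, not the authors' reference code; in particular the prompt template expected by the MOSS-RLHF reward model is an assumption here, and using a different template can by itself shift scores substantially:

```python
path = "./models/moss-rlhf-reward-model-7B-zh/recover"  # illustrative path
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaRewardModel.from_pretrained(path, tokenizer)  # tokenizer forwarded to __init__
model.eval()

inputs = tokenizer("<prompt and response text in the model's expected format>",
                   return_tensors="pt")
with torch.no_grad():
    # Run the underlying LlamaModel and score the last token's hidden state.
    hidden = model.model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
    reward = model.reward_head(hidden[:, -1, :])        # (1, 1) scalar reward
print(reward.item())
```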

In DeepSeek-V2, static YaRN with rope_base=10000 was used, yielding excellent extrapolation results. Could the authors clarify whether they have attempted to set rope_base to 500000 while using YaRN, and...
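The effect of rope_base on the rotary frequencies is easy to inspect directly. The snippet below only compares the standard RoPE inverse frequencies for the two bases (head_dim=128 is an illustrative choice) and is independent of any particular YaRN implementation:

```python
import torch

def rope_inv_freq(base: float, head_dim: int = 128) -> torch.Tensor:
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/d).
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

# A larger base shrinks the smallest inverse frequency, i.e. stretches the longest
# wavelength, which is why base=500000 already extrapolates further before any
# YaRN-style rescaling is applied.
print(rope_inv_freq(10_000)[-1], rope_inv_freq(500_000)[-1])
```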