Yuchen Han

6 issues by Yuchen Han

Why do I get this error when running it? Traceback (most recent call last): File "graph_gan.py", line 11, in from src import utils ModuleNotFoundError: No module named 'src'
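This error usually means the script is launched from a directory where the `src` package is not on Python's import path. A minimal workaround, assuming `src/` sits at the repository root and `graph_gan.py` lives one level below it (adjust the relative path to the actual layout), is:

```python
# Hypothetical fix: put the repository root (the directory containing src/)
# on sys.path before importing from it.
import os
import sys

REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, REPO_ROOT)  # makes `src` importable

from src import utils
```

Running the script from the repository root (e.g. `python -m subdir.graph_gan`, with the actual package path substituted) achieves the same thing without editing the code.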

Hello, Open Assistant developers. @andreaskoepf While studying your reward model training code, I noticed that in addition to the ranking loss there is an extra L2 regularization term. What is...
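For context, a pairwise reward-model objective with an added L2 penalty on the predicted scores often looks like the sketch below. This is a generic illustration under that assumption, not the Open Assistant implementation, and the names (`ranking_loss_with_l2`, `l2_coef`) are made up:

```python
import torch
import torch.nn.functional as F

def ranking_loss_with_l2(chosen_scores: torch.Tensor,
                         rejected_scores: torch.Tensor,
                         l2_coef: float = 0.01) -> torch.Tensor:
    # Pairwise ranking loss: push the chosen response's score above the rejected one's.
    rank_loss = -F.logsigmoid(chosen_scores - rejected_scores).mean()
    # Hypothetical L2 term on the scalar rewards themselves; it keeps the score
    # magnitudes small so the model cannot lower the loss by inflating both scores.
    l2_reg = l2_coef * (chosen_scores.pow(2).mean() + rejected_scores.pow(2).mean())
    return rank_loss + l2_reg
```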

Thanks to the authors for generously open-sourcing this. The official README says the Chinese reward model is based on open-chinese-llama-7b, but the later step-by-step instructions give: python merge_weight_zh.py recover --path_raw decapoda-research/llama-7b-hf --path_diff ./models/moss-rlhf-reward-model-7B-zh/diff --path_tuned ./models/moss-rlhf-reward-model-7B-zh/recover, which uses the original llama instead. I first merged with the original llama; the merge succeeded, but the scores were abnormal (#21). I then tried merging with https://huggingface.co/openlmlab/open-chinese-llama-7b-patch, but the integrity check in the code failed: Naive integrity check failed. This could imply that some of the checkpoint files are corrupted. @Ablustrund
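For reference, diff-based weight recovery of this kind typically just adds the released diff tensors to the base model's parameters, which is why merging against the wrong base model can succeed mechanically yet produce nonsense scores. The sketch below is an assumption about the general technique, not the actual behavior of merge_weight_zh.py, and the paths and function name are illustrative:

```python
import torch

def recover_weights(path_raw_state: str, path_diff_state: str, path_out: str) -> None:
    # Hypothetical sketch: tuned = raw + diff, applied tensor by tensor.
    raw = torch.load(path_raw_state, map_location="cpu")
    diff = torch.load(path_diff_state, map_location="cpu")
    merged = {name: raw[name] + diff[name] for name in diff}
    torch.save(merged, path_out)
```

An integrity check on top of this would normally compare checksums (or other statistics) of the merged tensors against values shipped with the diff, so a failure points at either corrupted files or a mismatched base model.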

Thanks to the authors for generously open-sourcing this, but I am running into some problems when scoring with the reward model: for most question-answer pairs it returns negative scores, and the scores also differ a lot between prompts. Could you tell me whether I am using it incorrectly? Here is my code:

```python
import torch
from transformers import LlamaTokenizer
from transformers.models.llama.modeling_llama import LlamaForCausalLM

class LlamaRewardModel(LlamaForCausalLM):
    def __init__(self, config, tokenizer):
        super().__init__(config)
        self.tokenizer = tokenizer
        self.reward_head = torch.nn.Linear(config.hidden_size, 1, bias=False)
        # ... (snippet truncated in the issue listing)
```
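A scoring sketch built on the class above would look roughly as follows. This is a hypothetical usage example, not the authors' reference code; in particular the prompt template expected by the MOSS-RLHF reward model is an assumption here, and using a different template can by itself shift scores substantially:

```python
path = "./models/moss-rlhf-reward-model-7B-zh/recover"  # illustrative path
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaRewardModel.from_pretrained(path, tokenizer)  # tokenizer forwarded to __init__
model.eval()

inputs = tokenizer("<prompt and response text in the model's expected format>",
                   return_tensors="pt")
with torch.no_grad():
    # Run the underlying LlamaModel and score the last token's hidden state.
    hidden = model.model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
    reward = model.reward_head(hidden[:, -1, :])        # (1, 1) scalar reward
print(reward.item())
```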

In DeepSeek-V2, static YaRN with rope_base=10000 was used, yielding excellent extrapolation results. Could the authors clarify whether they have attempted to set rope_base to 500000 while using YaRN, and...
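The effect of rope_base on the rotary frequencies is easy to inspect directly. The snippet below only compares the standard RoPE inverse frequencies for the two bases (head_dim=128 is an illustrative choice) and is independent of any particular YaRN implementation:

```python
import torch

def rope_inv_freq(base: float, head_dim: int = 128) -> torch.Tensor:
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/d).
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

# A larger base shrinks the smallest inverse frequency, i.e. stretches the longest
# wavelength, which is why base=500000 already extrapolates further before any
# YaRN-style rescaling is applied.
print(rope_inv_freq(10_000)[-1], rope_inv_freq(500_000)[-1])
```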