advNLG
Hello! Thank you for this excellent and interesting work! While reading the code for the paper `JOINT GENERATOR-RANKER LEARNING FOR NATURAL LANGUAGE GENERATION`, I noticed a potential issue: when updating the generator, the reward is computed as the sum of two parts, reranker_rewards and metric_rewards, but the two parts do not appear to be normalized before being added, since self.args.normalize_rewards is False. As a result, reranker_rewards and metric_rewards differ by several orders of magnitude, yet the paper concludes that reranker_rewards are the more important component. Am I missing something? Thanks in advance for the clarification!

The code below is from the compute_loss_generator function in JGR/trainer_utils/trainer.py:

```
self.reward_tracker['reranker_rewards'].append(reranker_rewards.detach().cpu().numpy().tolist())
self.reward_tracker['metric_rewards'].append(metric_rewards.detach().cpu().numpy().tolist())
if self.args.normalize_rewards:
    # rererank_rewards_std = torch.std(reranker_rewards, dim=1, keepdim = True)
    metric_rewards_std = torch.std(metric_rewards, dim=1, keepdim =...
```
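For reference, here is a minimal sketch of the kind of normalization the question is about: standardizing each reward component per sample before summing, so that terms on different scales contribute comparably. The function name, shapes, and the eps term are assumptions based on the truncated snippet above; this is not the authors' actual implementation.

```
import torch

def combine_rewards(reranker_rewards: torch.Tensor,
                    metric_rewards: torch.Tensor,
                    normalize: bool = True,
                    eps: float = 1e-8) -> torch.Tensor:
    # Hypothetical illustration only. Inputs are assumed to have shape
    # (batch_size, num_candidates), matching the dim=1 std in the snippet above.
    if normalize:
        # Divide each component by its per-sample standard deviation so both
        # lie on a similar scale before they are added together.
        reranker_rewards = reranker_rewards / (
            torch.std(reranker_rewards, dim=1, keepdim=True) + eps)
        metric_rewards = metric_rewards / (
            torch.std(metric_rewards, dim=1, keepdim=True) + eps)
    return reranker_rewards + metric_rewards
```

Dividing by a per-sample standard deviation is one common way to bring differently scaled reward terms onto a comparable footing; whether the released JGR code applies such a step when normalize_rewards is enabled is exactly the question raised above.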
Hello: I found this GitHub link in the paper `Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning`, but it doesn't seem to be...