Mr-Potential comments

Results: 4 comments of Mr-Potential

Question about reward score computation in the PPO stage

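When the [code](https://github.com/OpenLMLab/MOSS-RLHF/blob/main/ppo/ppo_datahelper.py#L201) computes GAE for each token position, it uses the reward at that position, reward[t]. However, when penalized_rewards is computed, the reward is added only at the final step, i.e. penalized_rewards[-1] += rewards[i]; at every other position, penalized_rewards contains only the KL penalty. Should the reward also be taken into account for these intermediate states?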
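To make the question concrete, here is a minimal sketch of the standard GAE recursion applied to a reward vector that is nonzero only at the last token. It is not the MOSS-RLHF implementation: the function name `compute_gae`, the zero KL penalties, the zero value estimates, and the choices `gamma=1.0`, `lam=0.95` are all assumptions made for illustration. Under these assumptions, the final-step reward still reaches earlier positions through the backward recursion, discounted by `gamma * lam` per step.

```python
def compute_gae(rewards, values, gamma=1.0, lam=0.95):
    """Generalized Advantage Estimation over one response (illustrative sketch)."""
    advantages = [0.0] * len(rewards)
    last_adv = 0.0
    # Walk backwards so each step can reuse the advantage of the step after it.
    for t in reversed(range(len(rewards))):
        # Value after the final token is taken to be 0 (episode ends there).
        next_value = values[t + 1] if t + 1 < len(rewards) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        last_adv = delta + gamma * lam * last_adv
        advantages[t] = last_adv
    return advantages


# Hypothetical 3-token response: no KL penalty, reward-model score of 1.0.
kl_penalty = [0.0, 0.0, 0.0]
penalized_rewards = [-k for k in kl_penalty]
penalized_rewards[-1] += 1.0  # score added only at the final position

values = [0.0, 0.0, 0.0]  # assumed critic outputs, all zero for clarity
advantages = compute_gae(penalized_rewards, values)
print(advantages)
```

With these toy inputs, every position receives a nonzero advantage even though only the last entry of `penalized_rewards` carries the score, which is one way to read the design: the sequence-level reward is treated as a terminal reward and credit is assigned to earlier tokens by the recursion rather than by per-token rewards.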

FAISS Index-Docstore Inconsistency Issue

I have also encountered this issue. You can resolve it by modifying the code [here](https://github.com/mem0ai/mem0/blob/main/mem0/vector_stores/faiss.py#L311C9-L317C94). Change: ``` if index_to_delete is not None: self.docstore.pop(vector_id, None) self.index_to_id.pop(index_to_delete, None) self._save() logger.info(f"Deleted vector...

Try to run qwen2vl in verl, raise KeyError: 'model.embed_tokens.weight' in qwen2vl_dtensor_weight_loader

Thank you for the reminder! It does help! In addition to the aforementioned name discrepancies, 'lm_head.weight' also needs to be revised. The complete revisions are as follows, for those who...

Try to run qwen2vl in verl, raise KeyError: 'model.embed_tokens.weight' in qwen2vl_dtensor_weight_loader

@ZSL98 Hi, may I ask if this part could be included in future updates? Thanks!