amulil
> LMDeploy hasn't supported window attention yet.

@lvhan028 Will LMDeploy support window attention? It seems LongLoRA uses window attention. If I want to deploy a LongLoRA model, can I use LMDeploy?
> lmdeploy hasn't supported window attention yet.

I mean, if I use a LongLoRA model, can I use LMDeploy to deploy it without using window attention?
I meet the same problem. @tridao Do you have a recommended way to solve it?
> [code](https://github.com/OpenLMLab/MOSS-RLHF/blob/main/ppo/ppo_datahelper.py#L201) When computing GAE for each token position, the reward[t] at that position is needed. But when penalized_rewards is computed, the reward is only added at the last step, i.e. penalized_rewards[-1] += rewards[i]; at every other position, penalized_rewards contains only the KL penalty. Do the rewards at those states need to be accounted for?

@ruizheng20 Could you please clarify this? I have the same doubt: why is the reward only added via penalized_rewards[-1] += rewards[i]? In the code, rank_all is False; if it were set to True, wouldn't every token then get its own reward, which would conflict with adding the reward only at penalized_rewards[-1]?
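For reference, here is a minimal sketch of the reward shaping being discussed, as it is commonly done in RLHF PPO implementations: a dense per-token KL penalty plus the scalar reward-model score added only at the final token, after which GAE backs that terminal reward up to earlier positions. This is not the MOSS-RLHF code; the function names (`shape_rewards`, `gae`) and the `kl_coef` parameter are hypothetical, and the hyperparameters are just placeholders.

```python
import torch

def shape_rewards(logprobs, ref_logprobs, reward_score, kl_coef=0.1):
    """logprobs / ref_logprobs: (T,) per-token log-probs; reward_score: scalar RM score."""
    kl = logprobs - ref_logprobs              # per-token KL estimate vs. the reference policy
    penalized = -kl_coef * kl                 # dense KL penalty at every position
    penalized[-1] += reward_score             # sequence-level reward added only at the last token
    return penalized

def gae(penalized_rewards, values, gamma=1.0, lam=0.95):
    """Standard GAE; values: (T,) state values, bootstrap value after the last token assumed 0."""
    T = penalized_rewards.shape[0]
    advantages = torch.zeros(T)
    last_adv = 0.0
    for t in reversed(range(T)):
        next_value = values[t + 1] if t + 1 < T else 0.0
        delta = penalized_rewards[t] + gamma * next_value - values[t]
        last_adv = delta + gamma * lam * last_adv
        advantages[t] = last_adv
    return advantages, advantages + values    # advantages and returns
```

In this shaping, intermediate tokens have no explicit reward because the reward model scores only the complete response; with gamma close to 1, the terminal reward still reaches earlier positions through the GAE recursion via the value estimates, which is why many implementations add it only at the last token.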