ColossalAI icon indicating copy to clipboard operation
ColossalAI copied to clipboard

[chat] fix value computing bug

Open zhang-yi-chi opened this issue 2 years ago • 0 comments

📌 Checklist before creating the PR

  • [x] I have created an issue for this PR for traceability
  • [x] The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • [ ] I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

fixed #3622

📝 What does this PR do?

https://github.com/hpcaitech/ColossalAI/blob/b0ce5a10326912961f0bc07cbbd250bab7b9c399/applications/Chat/coati/models/base/critic.py#L45-L50

We should use the generation part of the sequenses to compute value. So the indexing should be [-num_actions:] instead of [:-num_actions] and using action_mask.

💥 Checklist before requesting a review

  • [x] I have linked my PR to an issue (instruction)
  • [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • [x] I have performed a self-review of my code
  • [ ] I have added thorough tests.
  • [ ] I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • [x] 🌝 Yes, I do.
  • [ ] 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

zhang-yi-chi avatar Apr 22 '23 07:04 zhang-yi-chi