[ColossalChat] Update RLHF V2

Open YeAnbang opened this issue 1 year ago • 3 comments

📌 Checklist before creating the PR

  • [ ] I have created an issue for this PR for traceability
  • [ ] The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • [ ] I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here. If you have any plots/diagrams/screenshots/tables, please attach them here.

💥 Checklist before requesting a review

  • [ ] I have linked my PR to an issue (instruction)
  • [ ] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • [x] I have performed a self-review of my code
  • [x] I have added thorough tests.
  • [x] I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • [ ] 🌝 Yes, I do.
  • [ ] 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

YeAnbang, Jan 19 '24 08:01

Hi @YeAnbang, is this ready for review? Should I review https://github.com/hpcaitech/ColossalAI/pull/5148 first?

TongLi3701, Jan 21 '24 12:01

It is ready for review now. I am still working on fixing the following bug:

  • All scripts hang during booster.save_model with Gemini.

YeAnbang, Jan 22 '24 09:01

Local CI Test Results

ci.pdf

YeAnbang, Feb 02 '24 10:02

All tests passed locally.

[image]

YeAnbang, Mar 22 '24 10:03

Experiment Report

SFT

[image]

DPO

[image]

PPO

[image]

YeAnbang, Mar 25 '24 03:03