[Feature] Iterative DPO

Open wheresmyhair opened this issue 7 months ago • 0 comments

After many experiments and tests, we now support iterative DPO from a Python script. Several other useful features come along with it:

  1. Multi-instance vLLM inference (using Ray)
  2. Multi-instance reward model (RM) inference (also using Ray)
  3. Sampling from an LMFlow dataset, and train/test splitting of an LMFlow dataset
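
To illustrate what item 3 means in practice, here is a minimal sketch of sampling and splitting. It does not use LMFlow's own API: `sample_dataset` and `train_test_split` are hypothetical helpers written for this example, and it assumes the dataset is a plain dict in the LMFlow JSON layout, with a `type` field and an `instances` list.

```python
import random
from typing import Any, Dict, Tuple

# Assumed layout: {"type": "...", "instances": [ {...}, {...}, ... ]}
Dataset = Dict[str, Any]


def sample_dataset(dataset: Dataset, n: int, seed: int = 42) -> Dataset:
    """Return a new dataset holding n instances drawn without replacement."""
    instances = dataset["instances"]
    if n > len(instances):
        raise ValueError("n is larger than the number of instances")
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return {"type": dataset["type"], "instances": rng.sample(instances, n)}


def train_test_split(
    dataset: Dataset, test_size: float = 0.2, seed: int = 42
) -> Tuple[Dataset, Dataset]:
    """Shuffle the instances and split them into (train, test) datasets."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    instances = list(dataset["instances"])  # copy, so the input is not mutated
    random.Random(seed).shuffle(instances)
    n_test = int(round(len(instances) * test_size))
    train = {"type": dataset["type"], "instances": instances[n_test:]}
    test = {"type": dataset["type"], "instances": instances[:n_test]}
    return train, test


if __name__ == "__main__":
    data = {
        "type": "text_only",
        "instances": [{"text": f"example {i}"} for i in range(10)],
    }
    train, test = train_test_split(data, test_size=0.2)
    subset = sample_dataset(data, n=3)
    print(len(train["instances"]), len(test["instances"]), len(subset["instances"]))
```

Using a fixed seed keeps the split reproducible across iterations, which matters when each DPO round should be evaluated on the same held-out instances.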

wheresmyhair, Jul 19 '24 15:07