
Reproduce MT-Bench score

Open · bpucla opened this issue 3 months ago · 1 comment

Dear Authors,

Thank you for your great work! I'm trying to reproduce the reported MT-Bench scores with the released code and data.

Trying to reproduce:
DEITA-7B-v1.0 (6K) --> MT-Bench: 7.22
DEITA-7B-v1.0-sft --> MT-Bench: 7.32

Data I used:
hkust-nlp/deita-6k-v0
hkust-nlp/deita-10k-v0

Code I used: https://github.com/hkust-nlp/deita/blob/main/examples/train/sft.sh

For both the 6K and the 10K data, the MT-Bench score I get is around 7.06, versus the reported 7.22 and 7.32. The gap seems larger than the usual run-to-run variability of SFT training and MT-Bench evaluation.
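To make the size of the gap concrete, here is a minimal sketch of the arithmetic. The numbers are the ones quoted in this issue; the dictionary keys (`6k`, `10k`) are just labels I chose for the two runs, and "around 7.06" is treated as exactly 7.06 for illustration.

```python
# Scores quoted in this issue; the keys are illustrative labels, not official names.
reported = {"6k": 7.22, "10k": 7.32}
reproduced = {"6k": 7.06, "10k": 7.06}  # "around 7.06" for both runs

# Gap between the reported and the reproduced MT-Bench score, rounded to 2 decimals.
gaps = {name: round(reported[name] - reproduced[name], 2) for name in reported}

for name, gap in gaps.items():
    print(
        f"{name}: reported {reported[name]:.2f}, "
        f"reproduced {reproduced[name]:.2f}, gap {gap:.2f}"
    )
```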

Any suggestions to resolve the discrepancy would be appreciated.

Thanks!

bpucla · Mar 21 '24 07:03