optimum-habana

Add TRL examples for Mistral/Mixtral.

Open lkk12014402 opened this issue 10 months ago • 5 comments

What does this PR do?

Adds more trl-dpo training examples, covering the MoE model mistralai/Mixtral-8x7B-v0.1 as well as mistralai/Mistral-7B-v0.1.

lkk12014402 • Apr 15 '24 03:04

@lkk12014402, please 1) provide performance and convergence comparisons between A100 and Gaudi2, and 2) add CI. Thanks.

yao-matrix • Apr 19 '24 01:04

Yes, I will add these soon.

lkk12014402 • Apr 19 '24 02:04

Validated on Gaudi2:

  • Mixtral-8x7B, trl-sft, 8 Gaudi2 cards, DeepSpeed ZeRO-3

  • Mixtral-8x7B, trl-dpo, 8 Gaudi2 cards, DeepSpeed ZeRO-3

  • Mistral-7B, trl-sft, 8 Gaudi2 cards, DeepSpeed ZeRO-3

  • Mistral-7B, trl-dpo, 8 Gaudi2 cards, DeepSpeed ZeRO-3

lkk12014402 • Apr 24 '24 12:04

Hi @regisss, can we add these examples, since MoE models are popular?

lkk12014402 • Apr 24 '24 13:04

@lkk12014402, how does this compare with GPU performance?

Hi @libinta, I will update these as soon as possible.

lkk12014402 • May 21 '24 15:05