optimum-habana
Add TRL examples for Mistral/Mixtral
What does this PR do?
Add more TRL DPO training examples, e.g. for the MoE model mistralai/Mixtral-8x7B-v0.1 and for mistralai/Mistral-7B-v0.1.
@lkk12014402, please 1) provide performance and convergence comparisons between A100 and Gaudi2, and 2) add CI. Thanks.
Yes, I will add these soon.
Validated on Gaudi2:

- Mixtral-8x7B TRL SFT on 8 Gaudi2 cards, DeepSpeed ZeRO-3
- Mixtral-8x7B TRL DPO on 8 Gaudi2 cards, DeepSpeed ZeRO-3
- Mistral-7B TRL SFT on 8 Gaudi2 cards, DeepSpeed ZeRO-3
- Mistral-7B TRL DPO on 8 Gaudi2 cards, DeepSpeed ZeRO-3
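All of the runs above use DeepSpeed ZeRO-3 across 8 Gaudi2 cards. For readers unfamiliar with that setting, a minimal ZeRO-3 DeepSpeed config sketch is shown below; this is not the exact config used in this PR, and the specific values here are illustrative assumptions:

```json
{
  "train_batch_size": "auto",
  "bf16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

Stage 3 partitions optimizer states, gradients, and parameters across all ranks, which is what makes fitting Mixtral-8x7B on 8 cards feasible.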
Hi @regisss, can we add these examples, given how popular MoE models are?
@lkk12014402, how is the GPU performance comparison going?
Hi @libinta, I will update it as soon as possible.