
Llama3

Open peregilk opened this issue 8 months ago • 1 comments

Has anyone tried to train the newest models on MaxText, for instance Llama3 and Mistral v0.3?

It is a bit unclear to me how much work it would be to support these models here. For instance, do you have to implement GQA for Llama3?

If this is mainly updating the config files, it would be extremely helpful if there were working config files for the newest versions of Llama, Gemma and Mistral.

peregilk, Jun 04 '24 20:06