maxtext

A simple, performant and scalable Jax LLM!

Results 159 maxtext issues

Has anyone tried to train the newest models on MaxText, for instance Llama3 and Mistral v0.3? It is a bit unclear to me how much work this might be to...

The llama_or_mistral_ckpt.py script requires --base-model-path to be on the local file system, whereas --maxtext-model-path is on GCS. It would be good to change the implementation to use fsspec or tf GFile or...

feature request
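
One way the request above could be approached, as a minimal sketch rather than the script's actual implementation: route every read through a small helper that uses the built-in `open` for local paths and defers to `fsspec` for URLs such as `gs://...`. The helper name `read_bytes` is hypothetical, and reading `gs://` paths assumes the `gcsfs` package is installed alongside `fsspec`.

```python
def read_bytes(path):
    """Return the contents of `path`, which may be local or a remote URL.

    Hypothetical helper for illustration; not part of MaxText.
    """
    if "://" in path:
        # Remote URL (e.g. gs://bucket/key): import fsspec lazily so that
        # purely local conversions keep working without the dependency.
        import fsspec

        with fsspec.open(path, "rb") as f:
            return f.read()
    # Plain local path: the standard library is enough.
    with open(path, "rb") as f:
        return f.read()
```

With a helper like this, a local path and a GCS URL could be passed through the same code path, for example `read_bytes("gs://<bucket>/<file>")`, where the bucket and file names are placeholders.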

This PR modifies the Mixtral parameter conversion tests to go through `gcsfuse` instead of local disk, to lower VM disk usage.

pull ready

experimental_proxy only has 2 commits on it since branching:
```
$ git log --no-merges origin/experimental_proxy ^origin/main
commit 13f519e39e0d904e320c7d8a472161e4bcf03408 (HEAD -> avritt/noocdbt, origin/vivianrwu_experimental_proxy, origin/experimental_proxy, experimental_proxy)
Author: Zhihao Shan
Date: Mon Sep...
```

b/371572923 Tested on v4-128: https://cloudlogging.app.goo.gl/YqjMDsc27SxXHSLaA

pull ready

allow_split_physical_axes is currently supported only for device meshes, but we should also support it for hybrid meshes. This is useful when we want to use FSDP across DCN and ICI...

pull ready