maxtext
A simple, performant and scalable Jax LLM!
MaxText uses the environment variables JAX_COORDINATOR_IP, JAX_COORDINATOR_PORT, NNODES, and NODE_RANK for multi-system GPU training, but JAX_COORDINATOR_ADDRESS, a fixed port, JAX_PROCESS_COUNT, and a combination of several environment variables, and for multi-system...
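As a rough illustration of the GPU-side variables named above, here is a minimal sketch of how they could be combined into the arguments a multi-process JAX job needs. The helper name `gpu_distributed_kwargs` is hypothetical and not taken from MaxText. The mapping of `NNODES` to a process count and `NODE_RANK` to a process id is also an assumption.

```python
import os


def gpu_distributed_kwargs(env=None):
    """Hypothetical helper: build coordinator settings from the GPU-style variables.

    Assumes NNODES is the total number of processes and NODE_RANK is this
    process's index; MaxText's actual handling may differ.
    """
    env = os.environ if env is None else env
    return {
        # Coordinator address is the IP and port joined as "host:port".
        "coordinator_address": f"{env['JAX_COORDINATOR_IP']}:{env['JAX_COORDINATOR_PORT']}",
        "num_processes": int(env["NNODES"]),
        "process_id": int(env["NODE_RANK"]),
    }
```

The resulting dictionary uses the keyword names accepted by `jax.distributed.initialize`, so it could be passed as `jax.distributed.initialize(**gpu_distributed_kwargs())` on each node.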
I don't think https://us-python.pkg.dev/gce-ai-infra/maxtext-build-support-packages/simple/ is public.
Not GPipe. Run pipeline forward passes while backward passes are in progress.
Right now, data loading and loss computation assume one is only doing LM pretraining, but it'd be useful to support packed SFT style datasets (i.e. datasets with cleanly delineated prompt/completion...
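To make the request concrete, here is a minimal sketch of a completion-only loss over a packed row. It assumes each token carries a segment id (0 for padding, positive for a packed example) and a flag marking completion tokens. Both function names and the input layout are hypothetical and do not reflect MaxText's current data pipeline.

```python
def completion_loss_mask(segment_ids, is_completion):
    """Return 1.0 for tokens that should contribute to the loss, else 0.0.

    Assumption: segment id 0 marks padding; prompt tokens have
    is_completion=False and are excluded from the loss.
    """
    return [
        1.0 if seg != 0 and comp else 0.0
        for seg, comp in zip(segment_ids, is_completion)
    ]


def masked_mean_loss(per_token_loss, mask):
    """Average the per-token loss over the unmasked positions only."""
    total = sum(loss * m for loss, m in zip(per_token_loss, mask))
    count = sum(mask)
    return total / count if count else 0.0
```

For a row packing two examples followed by padding, only the completion tokens of each example would be averaged; prompt tokens and padding contribute nothing.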
Running `pyink` autoformatting through the `code_style.sh` script reformats a lot of files and introduces noise in the commits. Is it possible for `code_style.sh` to be run...