maxtext issues

fix dtype bug in adam_pax

# [[Bug] adam_pax has reuse donated buffer warning](https://github.com/google/maxtext/issues/490)

Reproduced with `weight_dtype=bfloat16`

```shell
python3 MaxText/train.py MaxText/configs/base.yml run_name=run steps=10 weight_dtype=bfloat16 opt_type=adam_pax dataset_type=synthetic enable_checkpointing=false
```

```
/home/lizhiyu/.local/lib/python3.10/site-packages/jax/_src/interpreters/mlir.py:914: UserWarning: Some donated buffers were not...
```

ZhiyuLi-goog

[Bug] adam_pax has reuse donated buffer warning

5

Hi, I noticed that using `adam_pax` instead of `adamw` as the optimizer gives a `reuse donated buffer` warning. I am wondering if this is expected, and why the code...

LeoXinhaoLee

add compatibility w orbax's new API and add an option to restore wit…

Changes checkpointing to use the new API and replaces `default` with `items`. Also adds an option to restore checkpoints with `SingleReplicaArrayHandler`.

ssusie

TFDS Data Processing Pipeline

5

Hi, I'm trying to understand some details in the TFDS data processing pipeline in your repo, and I'm confused about the following details: **In `_tfds_data_processing.py`:** (1) The `truncate_to_max_allowable_length` function truncates...

LeoXinhaoLee

Full JetEngine Support

rwitten

[Question] are there some train replication results?

2

Hi, Thanks for the library! I'm new to the JAX+LLM ecosystem and trying to understand which library I should be using. I see a lot of (very impressive) computational efficiency...

YannDubs