
Data loader for Maxtext training emulator for storage

Open RoshaniN opened this issue 1 year ago • 1 comments

The standalone data loader sets up the model and data iterator in the same way as the train_loop of train.py. It then iterates through the batches of data and logs the step time of the first step and the total time taken to load all the batches, but it does not actually train the model.
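The timing behaviour described above can be sketched roughly as follows. This is a minimal illustration only, not the code in standalone_dataloader.py: `fake_batches` and `time_data_loading` are hypothetical names, and the iterator is a stand-in for the real data iterator that train.py would construct.

```python
import time
from typing import Any, Dict, Iterable, Iterator


def fake_batches(num_batches: int, batch_size: int = 4) -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for the real data iterator (assumption)."""
    for step in range(num_batches):
        yield {"inputs": [step] * batch_size}


def time_data_loading(data_iterator: Iterable[Any], steps: int) -> Dict[str, float]:
    """Pull up to `steps` batches and time the loading; no training happens."""
    iterator = iter(data_iterator)
    start = time.perf_counter()
    first_step_seconds = 0.0
    batches_loaded = 0
    for step in range(steps):
        try:
            next(iterator)  # load the batch and discard it
        except StopIteration:
            break  # the iterator ran out before `steps` batches
        batches_loaded += 1
        if step == 0:
            # The first step usually includes iterator warm-up cost.
            first_step_seconds = time.perf_counter() - start
    total_seconds = time.perf_counter() - start
    return {
        "first_step_seconds": first_step_seconds,
        "total_seconds": total_seconds,
        "batches_loaded": batches_loaded,
    }


if __name__ == "__main__":
    stats = time_data_loading(fake_batches(num_batches=5), steps=5)
    print(f"first step: {stats['first_step_seconds']:.6f}s")
    print(f"all {stats['batches_loaded']} batches: {stats['total_seconds']:.6f}s")
```

Reporting the first step separately from the total keeps one-off start-up cost, such as opening files on the storage system, from being averaged into the steady-state load time.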

The data loader mimics the interactions of the TPU VM hosts with storage systems such as GCS.

I will add a GitHub workflow to run standalone_dataloader.py separately.

RoshaniN · Dec 08 '23 10:12

The missing CLA is due to pulling in commits from main: ❌ https://github.com/google/maxtext/commit/89012c22a4ff3fc7e0ba5c0a3075df54038d18f2 Author: @a-googler <no****ly@google.com>

RoshaniN · Jan 03 '24 16:01