Standalone data loader that emulates MaxText training for storage systems
The standalone data loader sets up the model and data iterator in the same way as the train_loop of train.py. It iterates through batches of data and logs the step time of the first step and the total time taken to load all batches, but it does not actually train the model.
The data loader mimics the interactions of the TPU VM hosts with storage systems such as GCS.
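As a rough illustration of the timing loop described above, here is a minimal sketch in plain Python. It is not the code in standalone_dataloader.py: the names `time_data_loading` and `fake_batches` are hypothetical, and the stand-in generator only simulates read latency where the real script would use the MaxText data iterator.

```python
import time
from typing import Any, Iterable, Iterator, Tuple


def time_data_loading(batches: Iterable[Any], num_steps: int) -> Tuple[float, float, int]:
    """Pull up to num_steps batches and time the loads; no training happens.

    Returns (first_step_seconds, total_seconds, steps_loaded).
    Hypothetical helper for illustration only.
    """
    iterator = iter(batches)
    first_step_seconds = 0.0
    steps_loaded = 0
    start = time.perf_counter()
    for step in range(num_steps):
        try:
            next(iterator)  # load the batch, then discard it
        except StopIteration:
            break  # the data source ran out before num_steps
        steps_loaded += 1
        if step == 0:
            # The first step often includes one-time setup cost,
            # so it is reported separately.
            first_step_seconds = time.perf_counter() - start
    total_seconds = time.perf_counter() - start
    return first_step_seconds, total_seconds, steps_loaded


def fake_batches(n: int, delay_s: float = 0.001) -> Iterator[dict]:
    """Stand-in data source (assumption): sleeps to simulate storage latency."""
    for i in range(n):
        time.sleep(delay_s)
        yield {"inputs": [i] * 4}


if __name__ == "__main__":
    first, total, steps = time_data_loading(fake_batches(5), num_steps=5)
    print(f"first step: {first:.4f}s, all {steps} batches: {total:.4f}s")
```

Timing the first step separately makes it easier to tell one-time startup cost apart from steady-state read throughput.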
I will add a GitHub workflow to run standalone_dataloader.py separately.
The missing CLA check is caused by commits pulled in from main: ❌ https://github.com/google/maxtext/commit/89012c22a4ff3fc7e0ba5c0a3075df54038d18f2 Author: @a-googler <no****ly@google.com>