kathir

Showing 6 issue results for kathir

How to train the model using TPUs?

Training the tokenizer is memory intensive: it needs hundreds of GB of RAM. What about using memmap to load only the required portion of the data...

How can I train nanoGPT using TPUs? Can I just modify the DDP setup to target TPU VMs, or do I need to change the model to make it XLA-compilable?

How to train an LLM using EasyLM on multiple TPUs in a distributed manner?

How to train Llama using TPUs?

Where is the code to train Llama on TPUs?