kathir

Showing 6 issue results for kathir

How to train the model using TPUs?

Training the tokenizer is memory intensive: it needs hundreds of GB of RAM. What about using memmap to load only the required portion of the data...

How can I train nanoGPT using TPUs? Can I just modify the DDP setup to target TPU VMs, or do I need to change the model to make it XLA-compilable?

How to train an LLM using EasyLM on multiple TPUs in a distributed manner?

How to train Llama using TPUs?

Where is the code to train Llama on TPUs?