gpt-2

Training on TPU

Open • Dhanachandra opened this issue 2 years ago • 1 comment

How can I train GPT2-xl on a TPU? Which TPU can be used for training, and how much RAM would be required?

Dhanachandra, Jan 12 '22 07:01

I'm not 100% sure, because I abandoned my TPU efforts before I got training working: TPUs ended up being way too expensive, and during development I was on a VM that was far too small, so training kept failing due to OOM errors on the VM. But I think that if you put the following code in train.py before the tf.Session() is created, it will connect to a TPU:

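# Resolve the TPU by name (replace the placeholder with your own TPU node name).
tpu_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="<TPU NODE NAME HERE>")
# Connect this process to the TPU cluster, then initialize the TPU system.
tf.config.experimental_connect_to_cluster(tpu_resolver)
tf.tpu.experimental.initialize_tpu_system(tpu_resolver)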
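
The RAM part of the question is not answered above, so here is a rough back-of-the-envelope sketch rather than a measured figure. It assumes about 1.5 billion parameters for GPT2-xl, float32 values, and an Adam-style optimizer that keeps two extra buffers per parameter. The helper name training_memory_gib and the copies parameter are made up for illustration; activations, the input pipeline, and framework overhead are not counted, so treat the result as a lower bound.

```python
# Back-of-the-envelope memory estimate for training. All figures are assumptions.
BYTES_PER_FLOAT32 = 4


def training_memory_gib(num_params, copies=4, bytes_per_value=BYTES_PER_FLOAT32):
    """Approximate memory for parameter-sized buffers, in GiB.

    copies=4 assumes float32 weights + gradients + two Adam moment buffers.
    Activations, the input pipeline, and framework overhead are NOT included.
    """
    return num_params * copies * bytes_per_value / 2**30


GPT2_XL_PARAMS = 1.5e9  # approximate parameter count (assumption)

weights_only = training_memory_gib(GPT2_XL_PARAMS, copies=1)
with_adam = training_memory_gib(GPT2_XL_PARAMS)

print(f"weights only: {weights_only:.1f} GiB")
print(f"weights + gradients + Adam state: {with_adam:.1f} GiB")
```

Under these assumptions the weights alone need several GiB and full training state needs a few times that, which is consistent with a small VM running out of memory. It does not by itself say which TPU type is sufficient.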

Noah-Huppert, Jun 28 '22 04:06