gpt-2
Training on TPU
How do you train GPT-2 XL on a TPU? Which TPU type can be used for training, and how much RAM is needed?
I'm not 100% sure, because I abandoned my TPU efforts before I got training working (TPUs ended up being way too expensive, and the VM I was developing on was far too small, so training kept failing with OOM errors on the VM). But I think that if you put the following code before the tf.Session() is created in train.py, it will connect to a TPU:
# Locate the TPU worker by its node name, attach this process to it,
# and initialize the TPU system before any session is created
tpu_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="<TPU NODE NAME HERE>")
tf.config.experimental_connect_to_cluster(tpu_resolver)
tf.tpu.experimental.initialize_tpu_system(tpu_resolver)