xlnet
TPU num_shards and num_replicas error
When I use a TPU v3-32 with TF 1.13 to train XLNet, I get the error below. How can I fix it?
Found TPU system:
tpu_system_metadata.py:121] *** Num TPU Cores: 8
tpu_system_metadata.py:122] *** Num TPU Workers: 1
tpu_system_metadata.py:124] *** Num TPU Cores Per Worker: 8
...
ValueError: TPUConfig.num_shards is not set correctly. According to TPU system metadata for Tensorflow master: num_replicas should be (8), got (32). For non-model-parallelism, num_replicas should be the total num of TPU cores in the system. For model-parallelism, the total number of TPU cores should be num_cores_per_replica * num_replicas. Please set it accordingly or leave it as None
I have the same issue as well.
You may refer to this one: https://github.com/zihangdai/xlnet/pull/239/files#diff-0
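For anyone trying to work out what value the check expects, here is a minimal sketch of the rule spelled out in the error message. It is plain Python, not TensorFlow code, and the helper names `expected_num_replicas` and `check_num_shards` are hypothetical, made up for illustration. In the log above the system reported 8 cores while the config asked for 32, so the check fails. The message itself suggests two ways out: set the value to match the detected cores, or leave it as None.

```python
from typing import Optional


def expected_num_replicas(total_cores: int,
                          num_cores_per_replica: Optional[int] = None) -> int:
    """Hypothetical helper: the replica count implied by the error message.

    Without model parallelism, replicas == total TPU cores.
    With model parallelism, total cores == num_cores_per_replica * replicas.
    """
    if num_cores_per_replica is None:
        return total_cores
    if total_cores % num_cores_per_replica != 0:
        raise ValueError("total_cores must be divisible by num_cores_per_replica")
    return total_cores // num_cores_per_replica


def check_num_shards(num_shards: Optional[int],
                     total_cores: int,
                     num_cores_per_replica: Optional[int] = None) -> bool:
    """Hypothetical helper: True if num_shards is consistent with the detected cores."""
    if num_shards is None:
        # The error message says leaving the value as None is acceptable.
        return True
    return num_shards == expected_num_replicas(total_cores, num_cores_per_replica)


# Values from the log above: 8 cores detected, 32 requested.
print(check_num_shards(32, total_cores=8))    # mismatch, as in the ValueError
print(check_num_shards(8, total_cores=8))     # matches the detected cores
print(check_num_shards(None, total_cores=8))  # left unset
```

This only mirrors the arithmetic in the message. It does not explain why a v3-32 is being detected as 8 cores with a single worker, which would need to be checked separately.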