cloud
cloud copied to clipboard
Tensorflow Cloud Tutorial failed to train model
When using the TensorFlow Cloud Tutorial https://www.tensorflow.org/cloud/tutorials/overview , the model fail to train on GCP. I've added the log file and a screenshot of the error . I'm a google employee and ldap is alankho - happy to share my GCP project with whoever is working on the bug.
The failure happened in Google AI Platform when training the model after about 5 hours
I can't tell if AI Platform is using the GPUs properly because when training, the GPU page show that the utilization is 0%, where as the CPU page shows close to 80%. I've uploaded screenshots for the team.
Hi my name is Aarushi Soni . I want to contribute to this issue . Is this issue still open ? I am first time contributor . Please guide me through this process.