flownet2-tf
process killed during training
Thank you very much for your excellent work! I ran into a problem when training the model on the dataset.
The output is:
/flownet2-tf$ python -m src.flownet_s.train
2017-10-31 11:24:27.425738: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.425771: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.425793: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.425811: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.425829: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX512F instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.425847: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-10-31 11:24:27.646726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Quadro P5000
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:65:00.0
Total memory: 15.89GiB
Free memory: 15.47GiB
2017-10-31 11:24:27.646762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-10-31 11:24:27.646780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-10-31 11:24:27.646793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro P5000, pci bus id: 0000:65:00.0)
Killed
Although a model file was generated, I think the training process was killed before it finished training on the whole dataset.
I tried reducing BATCH_SIZE to 2, but the result was the same: the process was still killed.
Have you encountered this situation, and how can I fix it?
I look forward to your reply, and thank you very much for your contribution!