sleepfin
I agree with xyc112233. There is a mistake in the following code: ``` java if (childLeft + childWidth + paddingRight > myWidth) { childLeft = paddingLeft; childTop += mVerticalSpacing +...
I have the same problem on tf-1.4. Is there any way to fix this?
`--batch_size=64` gets better performance (Speedup=0.9443), as expected, because there is more time for gradients and variables to transfer. But on high-performance hardware like the NVIDIA P100, which takes less time to compute forward...
I failed to upload my trace file for an unknown reason. You can reproduce my results in similar environments. I notice that the jitter using batch_size=12 (around 10.0) is much larger than...
@cryptox31 @reedwm Look at this: I sleep 2 seconds after each session run and the performance is better. Code: ``` def benchmark_one_step(...): if image_producer is not None: image_producer.notify_image_consumption() train_time =...
BTW, on a single node (1 GPU), the FPS is 42.67, so each step takes 12 / 42.67 = 0.281 seconds. On a single node (4 GPUs), the FPS is...
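To make the step-time arithmetic easy to repeat for other configurations, here is a minimal sketch. It assumes "FPS" means images processed per second and that 12 is the per-step batch size, as in the 1-GPU figure above; the function name `step_time_seconds` is hypothetical and not part of any benchmark script.

```python
def step_time_seconds(batch_size: int, images_per_second: float) -> float:
    """Return the time one training step takes, given the throughput.

    Assumes images_per_second is the measured FPS and batch_size is the
    number of images consumed per step.
    """
    if images_per_second <= 0:
        raise ValueError("images_per_second must be positive")
    return batch_size / images_per_second


# 1-GPU figures quoted in the comment: batch size 12, 42.67 images/s.
single_gpu_step = step_time_seconds(batch_size=12, images_per_second=42.67)
print(f"{single_gpu_step:.3f} s per step")  # prints "0.281 s per step"
```

The same call can be used with the FPS measured for any other setup to compare per-step times directly.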
I cannot upload my trace file, but if someone can reproduce my results, you'll find that at the end of each step, RecvOP of variables from ps0 -> worker0 and...
@reedwm Can anyone help explain the difference between the `sleep version` and the `no sleep version`?
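The TensorFlow gfile connection to OBS was interrupted and could not reconnect. There is currently no good solution to this problem. After the November 15 change, please try adding the following at the very top of your code:
```
import moxing.tensorflow as mox
mox.cache()
```
This lets TensorFlow's reads and writes of ckpt and summary files be relayed through a local cache, which works around the problem.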