benchmarks
A benchmark framework for TensorFlow
https://github.com/tensorflow/benchmarks/blob/6c2ccb45049673f09fdea9406372d6561db5c4fd/scripts/tf_cnn_benchmarks/models/mobilenet_v2.py#L47 Does this mean slim.conv2d uses the default data_format NHWC here?
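For reference, a minimal sketch (not the benchmarks code itself) of making the layout explicit: slim.conv2d treats an unset `data_format` as NHWC, so setting it through an `arg_scope` removes the ambiguity. The input shape and filter sizes below are placeholders chosen for illustration.

```
import tensorflow as tf
import tensorflow.contrib.slim as slim

# Illustrative NHWC input batch; the benchmarks feed real image tensors here.
images = tf.placeholder(tf.float32, [None, 224, 224, 3])

# Make the layout explicit instead of relying on slim.conv2d's default.
with slim.arg_scope([slim.conv2d], data_format='NHWC'):
    net = slim.conv2d(images, 32, [3, 3], stride=2, scope='conv1')
```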
Turning on XLA (`--xla_compile=True`) in distributed mode causes a failure during initialization:

> 2019-01-08 01:10:13.755644: E tensorflow/core/distributed_runtime/master.cc:315] CreateSession failed because worker /job:worker/replica:0/task:1 returned error: Unavailable: OS Error
> Additional GRPC error...
I'm trying to run tf_cnn_benchmarks.py on a Power9 machine. When I ran the benchmark with Horovod using 1 GPU, it worked fine; when I tried to use 2 GPUs...
In preprocessing.py, the Cifar10ImagePreprocessor class uses the older dataset reader APIs (tf.train.shuffle_batch, etc.), which have been deprecated; it needs to be updated to use the tf.data APIs.
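A rough sketch of what a tf.data-based replacement could look like; the file names, parse function, and shuffle/batch sizes are illustrative assumptions, not the benchmarks' actual implementation.

```
import tensorflow as tf

def parse_cifar10_record(raw_record):
    # CIFAR-10 binary format: 1 label byte followed by a 3x32x32 uint8 image.
    record = tf.decode_raw(raw_record, tf.uint8)
    label = tf.cast(record[0], tf.int32)
    image = tf.reshape(record[1:], [3, 32, 32])
    image = tf.cast(tf.transpose(image, [1, 2, 0]), tf.float32)  # CHW -> HWC
    return image, label

record_bytes = 1 + 3 * 32 * 32
ds = tf.data.FixedLengthRecordDataset(['cifar-10/data_batch_1.bin'], record_bytes)
ds = ds.map(parse_cifar10_record, num_parallel_calls=4)
ds = ds.shuffle(buffer_size=10000).repeat().batch(128).prefetch(1)
images, labels = ds.make_one_shot_iterator().get_next()
```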
I am reading the benchmarks source code. The following piece of code is the part that creates a TensorFlow dataset from TFRecord files:

```
ds = tf.data.TFRecordDataset.list_files(tfrecord_file_names)
ds = ds.apply(interleave_ops.parallel_interleave(tf.data.TFRecordDataset, cycle_length=10))
```

I...
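For context, a self-contained version of that input stage; the shard pattern is an assumption. `parallel_interleave` opens `cycle_length` TFRecord files at once and interleaves their records, rather than reading the files one after another.

```
import tensorflow as tf

# Hypothetical shard pattern; the benchmarks derive the file list from --data_dir.
ds = tf.data.Dataset.list_files('/data/imagenet/train-*')
ds = ds.apply(tf.contrib.data.parallel_interleave(
    tf.data.TFRecordDataset, cycle_length=10))
ds = ds.prefetch(buffer_size=1024)
```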
Notice the code [HERE](https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py#L333):

```
biased = tf.reshape(
    tf.nn.bias_add(conv, biases, data_format=self.data_format),
    conv.get_shape())
```

I think the output shape of bias_add is exactly the same as conv.get_shape(), so why bother to...
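A tiny check of the premise in that question, under the assumption that the static shape of `conv` is fully defined; the shapes below are made up for illustration.

```
import tensorflow as tf

conv = tf.zeros([32, 56, 56, 64])    # illustrative NHWC activation
biases = tf.zeros([64])

# bias_add broadcasts the bias over the channel dimension, so the static
# shape matches the input when it is fully known.
biased = tf.nn.bias_add(conv, biases, data_format='NHWC')
print(biased.get_shape() == conv.get_shape())  # True
```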
I am testing AlexNet with 8 V100 PCIe GPUs using the real ImageNet dataset. The result for one card is about 3700; two cards, 6400; three cards, 6000...
With the following code, `nvidia-smi nvlink -g 0 -i 0` reports zero bytes transmitted/received. The same happens if I kick off the benchmarks with `--all_reduce_spec=nccl --variable_update=replicated`. `from tensorflow.contrib.nccl import all_sum` `with`...
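For reference, a minimal standalone sketch of the kind of direct NCCL test the issue describes, assuming two visible GPUs; tensor sizes and device strings are illustrative.

```
import tensorflow as tf
from tensorflow.contrib.nccl import all_sum

# One tensor per GPU; all_sum returns one reduced copy per input device.
towers = []
for i in range(2):
    with tf.device('/gpu:%d' % i):
        towers.append(tf.random_normal([1 << 22]))  # ~16 MB per GPU

summed = all_sum(towers)

with tf.Session() as sess:
    # The traffic generated here is what `nvidia-smi nvlink` should count.
    sess.run(summed)
```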
Hi, we are using the benchmark script to test the performance of our GPUs, and we found that if we enable XLA (`xla_compile`), the throughput increases a lot. However, the total loss...
Hi, I followed this [article](https://medium.com/tensorflow/pushing-the-limits-of-gpu-performance-with-xla-53559db8e473) and reproduced the throughput it reported. However, when I try to compile TensorFlow myself, I cannot achieve the throughput the article did...