benchmarks
A benchmark framework for TensorFlow
https://github.com/tensorflow/benchmarks/blob/6c2ccb45049673f09fdea9406372d6561db5c4fd/scripts/tf_cnn_benchmarks/models/mobilenet_v2.py#L47 Does this mean slim.conv2d uses the default data_format NHWC here?
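For reference, a minimal sketch (not the benchmarks code itself) of making the layout explicit: slim.conv2d treats an unset `data_format` as NHWC, so setting it through an `arg_scope` removes the ambiguity. The input shape and filter sizes below are placeholders chosen for illustration.

```
import tensorflow as tf
import tensorflow.contrib.slim as slim

# Illustrative NHWC input batch; the benchmarks feed real image tensors here.
images = tf.placeholder(tf.float32, [None, 224, 224, 3])

# Make the layout explicit instead of relying on slim.conv2d's default.
with slim.arg_scope([slim.conv2d], data_format='NHWC'):
    net = slim.conv2d(images, 32, [3, 3], stride=2, scope='conv1')
```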
Turning on XLA (`--xla_compile=True`) in distributed mode causes a failure during initialization:

> 2019-01-08 01:10:13.755644: E tensorflow/core/distributed_runtime/master.cc:315] CreateSession failed because worker /job:worker/replica:0/task:1 returned error: Unavailable: OS Error
> Additional GRPC error...
I'm trying to run tf_cnn_benchmarks.py on a Power9 machine. When I ran the benchmark with Horovod using 1 GPU, it worked fine; when I tried to use 2 GPUs...
In preprocessing.py, the Cifar10ImagePreprocessor class uses the older dataset reader APIs (tf.train.shuffle_batch, etc.), which have been deprecated; it needs to be updated to use the tf.data APIs.
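A rough sketch of what a tf.data-based replacement could look like; the file names, parse function, and shuffle/batch sizes are illustrative assumptions, not the benchmarks' actual implementation.

```
import tensorflow as tf

def parse_cifar10_record(raw_record):
    # CIFAR-10 binary format: 1 label byte followed by a 3x32x32 uint8 image.
    record = tf.decode_raw(raw_record, tf.uint8)
    label = tf.cast(record[0], tf.int32)
    image = tf.reshape(record[1:], [3, 32, 32])
    image = tf.cast(tf.transpose(image, [1, 2, 0]), tf.float32)  # CHW -> HWC
    return image, label

record_bytes = 1 + 3 * 32 * 32
ds = tf.data.FixedLengthRecordDataset(['cifar-10/data_batch_1.bin'], record_bytes)
ds = ds.map(parse_cifar10_record, num_parallel_calls=4)
ds = ds.shuffle(buffer_size=10000).repeat().batch(128).prefetch(1)
images, labels = ds.make_one_shot_iterator().get_next()
```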
I am reading the benchmarks source code. The following piece of code is the part that creates a TensorFlow dataset from TFRecord files:

```
ds = tf.data.TFRecordDataset.list_files(tfrecord_file_names)
ds = ds.apply(interleave_ops.parallel_interleave(tf.data.TFRecordDataset, cycle_length=10))
```

I...
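For context, a self-contained version of that input stage; the shard pattern is an assumption. `parallel_interleave` opens `cycle_length` TFRecord files at once and interleaves their records, rather than reading the files one after another.

```
import tensorflow as tf

# Hypothetical shard pattern; the benchmarks derive the file list from --data_dir.
ds = tf.data.Dataset.list_files('/data/imagenet/train-*')
ds = ds.apply(tf.contrib.data.parallel_interleave(
    tf.data.TFRecordDataset, cycle_length=10))
ds = ds.prefetch(buffer_size=1024)
```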
Notice the code [HERE](https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py#L333):

```
biased = tf.reshape(
    tf.nn.bias_add(conv, biases, data_format=self.data_format),
    conv.get_shape())
```

I think the output shape of bias_add is exactly the same as conv.get_shape(), so why bother to...
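A tiny check of the premise in that question, under the assumption that the static shape of `conv` is fully defined; the shapes below are made up for illustration.

```
import tensorflow as tf

conv = tf.zeros([32, 56, 56, 64])    # illustrative NHWC activation
biases = tf.zeros([64])

# bias_add broadcasts the bias over the channel dimension, so the static
# shape matches the input when it is fully known.
biased = tf.nn.bias_add(conv, biases, data_format='NHWC')
print(biased.get_shape() == conv.get_shape())  # True
```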
I am testing AlexNet with 8 V100 PCIe GPUs using the real ImageNet dataset. The result for one card is about 3700; two cards, 6400; three cards, 6000...
With the following code, `nvidia-smi nvlink -g 0 -i 0` reports zero bytes transmitted/received. The same happens if I kick off the benchmarks with `--all_reduce_spec=nccl --variable_update=replicated`. `from tensorflow.contrib.nccl import all_sum` `with`...
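For reference, a minimal standalone sketch of the kind of direct NCCL test the issue describes, assuming two visible GPUs; tensor sizes and device strings are illustrative.

```
import tensorflow as tf
from tensorflow.contrib.nccl import all_sum

# One tensor per GPU; all_sum returns one reduced copy per input device.
towers = []
for i in range(2):
    with tf.device('/gpu:%d' % i):
        towers.append(tf.random_normal([1 << 22]))  # ~16 MB per GPU

summed = all_sum(towers)

with tf.Session() as sess:
    # The traffic generated here is what `nvidia-smi nvlink` should count.
    sess.run(summed)
```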
Hi, we are using the benchmark script to test the performance of our GPUs, and we found that if we enable XLA (`xla_compile`), the throughput increases a lot. However, the total loss...
Hi, I followed this [article](https://medium.com/tensorflow/pushing-the-limits-of-gpu-performance-with-xla-53559db8e473) and reproduced the throughput it reported. However, when I try to compile TensorFlow myself, I cannot achieve the throughput the article did...