
data_flow_ops.RecordInput outperforms tf.data

Open zhao1157 opened this issue 5 years ago • 0 comments

@reedwm I ran some performance tests of ResNet50 (as well as the v1.5 and v2 variants) on Tesla T4 and V100 GPUs, using 1 to 8 GPUs. I compared three input pipelines: (1) data_flow_ops.RecordInput + data_flow_ops.StagingArea, enabled with --datasets_use_prefetch=False --use_datasets=False; (2) tf.data + multi_device_iterator_ops.MultiDeviceIterator, enabled with --datasets_use_prefetch=True --use_datasets=True; and (3) tf.data + data_flow_ops.StagingArea, enabled with --datasets_use_prefetch=False --use_datasets=True. In my tests, pipeline (1) generally outperformed both tf.data-based pipelines. However, every model I have come across so far uses the tf.data API in its input pipeline. Since my tests showed better performance with data_flow_ops.RecordInput than with tf.data, which one do you recommend we use?
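
For reference, here is a minimal sketch of the flag-to-pipeline mapping described above. It is plain Python and does not call TensorFlow or the benchmark script; the names `PIPELINES`, `describe_pipeline`, and `flags_for` are hypothetical helpers made up for illustration. The fourth combination (--datasets_use_prefetch=True --use_datasets=False) was not part of these tests, so the sketch returns `None` for it rather than guessing.

```python
from typing import Optional

# Mapping taken from the description in this issue.
# Keys are (use_datasets, datasets_use_prefetch).
PIPELINES = {
    (False, False): "data_flow_ops.RecordInput + data_flow_ops.StagingArea",
    (True, True): "tf.data + multi_device_iterator_ops.MultiDeviceIterator",
    (True, False): "tf.data + data_flow_ops.StagingArea",
}


def describe_pipeline(use_datasets: bool, datasets_use_prefetch: bool) -> Optional[str]:
    """Return the pipeline named in this issue, or None if the combination was not tested."""
    return PIPELINES.get((use_datasets, datasets_use_prefetch))


def flags_for(use_datasets: bool, datasets_use_prefetch: bool) -> str:
    """Format the two command-line flags in the order used in this issue."""
    return (
        f"--datasets_use_prefetch={datasets_use_prefetch} "
        f"--use_datasets={use_datasets}"
    )


if __name__ == "__main__":
    # Print each tested flag combination next to the pipeline it selects.
    for (use_datasets, prefetch), name in PIPELINES.items():
        print(flags_for(use_datasets, prefetch), "->", name)
```

Running the script prints one line per tested configuration, which can serve as a quick checklist when repeating the comparison.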

zhao1157, Sep 10 '20 12:09