Reed

Results: 116 comments by Reed

I can only comment on the log message

> 2022-09-28 17:14:40.865705: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been...

@ezhulenev, can you take a look? I'm not very familiar with the grappler passes this PR modifies.

@pearu, are the XLA PRs that address the inaccuracies ready for review? That is, are the following PRs ready for review?

* https://github.com/openxla/xla/pull/8589
* https://github.com/openxla/xla/pull/10376
* https://github.com/openxla/xla/pull/10503
* https://github.com/openxla/xla/pull/10525

The issue is that TensorFlow leaves some memory unallocated for CUDA libraries to use:

https://github.com/tensorflow/tensorflow/blob/6f788cd037e7ba3210f835f753b8e1dc7de22df2/tensorflow/core/common_runtime/gpu/gpu_device.cc#L1055-L1073

On Ampere GPUs like the RTX 3060 Ti and RTX 3050 Ti mentioned in this post, 1536MiB is left unallocated....
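To make the arithmetic concrete, here is a minimal sketch of how much memory would remain for TensorFlow's own allocator once the reserve is subtracted. It is not TensorFlow code: the helper `allocatable_bytes` and the 8 GiB card size are hypothetical, and only the 1536MiB figure comes from the comment above. The real logic in the linked `gpu_device.cc` depends on factors not modeled here.

```python
MIB = 1024 * 1024


def allocatable_bytes(total_bytes: int, reserved_mib: int = 1536) -> int:
    """Return the bytes left after setting aside a reserve for CUDA libraries.

    Hypothetical helper for illustration only. The default of 1536 MiB is the
    figure quoted in the comment for Ampere GPUs.
    """
    reserved = reserved_mib * MIB
    # Clamp at zero so a card smaller than the reserve does not go negative.
    return max(total_bytes - reserved, 0)


# Assumed example: a card reporting 8 GiB of total memory.
total = 8 * 1024 * MIB
usable = allocatable_bytes(total)
print(f"{usable // MIB} MiB available to the allocator")  # 6656 MiB
```

Under these assumptions, roughly a fifth of an 8 GiB card is held back, which is why an out-of-memory error can appear well before the reported total is reached.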

This question is better asked on [StackOverflow](http://stackoverflow.com/questions/tagged/tensorflow) since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!

Thank you @AndreasMadsen for the short script to reproduce and for removing the `transformers` dependency to keep the example simpler. I can reproduce the issue outside docker with CUDA 11.2.0...

Unfortunately I had to roll back this PR since it broke a oneDNN build on arm64. The error was:

```
In file included from tensorflow/core/kernels/mkl/mkl_einsum_op.cc:19:
In file included from tensorflow/core/kernels/mkl/mkl_batch_matmul_helper.h:25:
./tensorflow/core/kernels/mkl/mkl_matmul_ops_common.h:927:38:...
```

> @cantonios - Is it ok to close this PR since the changes are already merged into master?

Yes, this was merged in 21e9d7292d90db23805585e6f1846693692c0b83. I'm not sure why this PR...