2022-09-17 23:49:02.600018: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function 'fusion_24', 8 bytes spill stores, 16 bytes spill loads ptxas warning : Registers are spilled to local memory in function '__internal_trig_reduction_slowpathd', 4 bytes spill stores, 4 bytes spill loads
Issue Type
Bug
Source
source
Tensorflow Version
2.9.2
Custom Code
Yes
OS Platform and Distribution
No response
Mobile device
No response
Python version
No response
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current Behaviour?
Trying to run deepxde with tensorflow (TF2) backend.
I think this is related to
https://github.com/tensorflow/tensorflow/issues/33375
This question partly relates to the answer provided there.
In that issue, the following answer is given
> Hi @kleyersoma. The workaround for this particular problem on unix-based machines is to link your cuda bin to your working directory. Go to the directory, where you launch your python code and create the link:
> `ln -s /full/path/to/your/cuda/installation/bin .`
> This solves the problem. The point is that TF first tries to load the ptxas from ./bin directory, then from /usr/local/cuda/bin. Unfortunately, it completely ignores the environment variables (which I consider to be a bug).
It is not clear to me what "the directory, where you launch your python code" refers to.
I am running Ubuntu 22.04, and
which python3
gives
/usr/bin/python3
Is this the directory you're referring to?
If so, in my case I would do
ln -s /usr/local/cuda/bin /usr/bin/python3
Is that correct? Thanks.
Standalone code to reproduce the issue
https://colab.research.google.com/drive/1rYD_GMLAWJ6uTx76RfYvLWXB2nf9NP7Q?usp=sharing
Relevant log output
U instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-17 23:48:43.139452: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-17 23:48:43.139501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10126 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
mexclusions
[]
Compiling model...
'compile' took 0.000393 s
Warning: epochs is deprecated and will be removed in a future version. Use iterations instead.
Training model...
2022-09-17 23:48:46.762887: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x560ea9633610 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-17 23:48:46.762922: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-17 23:48:46.842763: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2022-09-17 23:48:51.944301: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function 'input_fusion_reduce_4', 33468 bytes spill stores, 38624 bytes spill loads
ptxas warning : Registers are spilled to local memory in function '__internal_accurate_pow', 132 bytes spill stores, 132 bytes spill loads
ptxas warning : Registers are spilled to local memory in function '__internal_trig_reduction_slowpathd', 56 bytes spill stores, 48 bytes spill loads
2022-09-17 23:48:51.953246: I tensorflow/compiler/jit/xla_compilation_cache.cc:478] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2022-09-17 23:48:57.307073: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function 'input_fusion_reduce_4', 33468 bytes spill stores, 38624 bytes spill loads
ptxas warning : Registers are spilled to local memory in function '__internal_accurate_pow', 132 bytes spill stores, 132 bytes spill loads
ptxas warning : Registers are spilled to local memory in function '__internal_trig_reduction_slowpathd', 56 bytes spill stores, 48 bytes spill loads
Step Train loss Test loss Test metric
0 [1.98e+03, 3.67e+02, 3.39e-02, 2.04e-01, 3.69e-02, 2.30e-01, 9.99e-02, 2.29e-01, 7.95e-02] [1.98e+03, 3.67e+02, 3.39e-02, 2.04e-01, 3.69e-02, 2.30e-01, 9.99e-02, 2.29e-01, 7.95e-02] []
2022-09-17 23:49:02.600018: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function 'fusion_24', 8 bytes spill stores, 16 bytes spill loads
ptxas warning : Registers are spilled to local memory in function '__internal_trig_reduction_slowpathd', 4 bytes spill stores, 4 bytes spill loads
@gadagashwini, I was able to reproduce the issue on tensorflow v2.8 and nightly. Kindly find the gist of it here.
> /usr/bin/python3 Is this the directory you're referring to? If so, in my case I would do ln -s /usr/local/cuda/bin /usr/bin/python3 Is that correct? Thanks
Let's say you have your Python code (a .py file) in a specific directory; go to that directory and then run the command below.
Go to the directory where you launch your Python code and create the link:
ln -s /full/path/to/your/cuda/installation/bin .
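To make that concrete, here is a minimal sketch. The paths are only placeholders (~/my_project and train.py stand in for wherever your own script lives), and it assumes CUDA is installed under /usr/local/cuda:

cd ~/my_project                  # the directory containing train.py, not /usr/bin
ln -s /usr/local/cuda/bin .      # creates ./bin pointing at the CUDA binaries
ls ./bin/ptxas                   # check that ptxas is now reachable as ./bin/ptxas
python3 train.py                 # launch from this directory so TF looks in ./bin first

In other words, /usr/bin/python3 is just the path to the interpreter binary; the relevant directory is the one you cd into before running python3 your_script.py.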
Hope this helps!!
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.