keras
Can get CUDA to work with torch but not with tensorflow
I am running Ubuntu 22.04.3 LTS with Python 3.10.12 and GCC 11.4.0. The system has an NVIDIA GeForce RTX 3060 card with driver version 535.129.03 and CUDA version 12.2.
I installed keras-3.0.2 in two different virtual envs, one with the tensorflow-cuda backend and one with the torch-cuda backend. In both cases I followed the instructions given:
For tf-cuda
pip install -r requirements-tensorflow-cuda.txt
python pip_build.py --install
For torch-cuda
pip install -r requirements-torch-cuda.txt
python pip_build.py --install
When I test the torch-cuda env, I get torch version 2.1.1+cu118, which detects CUDA, so that setup seems to be OK.
But when I test the tf-cuda env, I get:
2023-12-30 07:34:38.417075: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-30 07:34:38.891617: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
tf.test.gpu_device_name() # should get a name, but the return is empty
tf.config.list_physical_devices('GPU') # returns []
Any help/suggestions to fix this issue are greatly appreciated. Thank you.
@JuanVargas - Can you try in a new virtual env -
pip install torch # also install Cuda 12
pip install tensorflow # will download TF 2.15 that should use the same cuda 12
# pip install jax (if needed)
Then test TensorFlow?
Hi Ramesh
I tried the steps you suggested in a new virtual env. As you said, the version of tf/keras installed is 2.15.0. The torch version is 2.1.2+cu121. Both were able to recognize the GPU in the system.
So it looks like tf/keras v 2.16 still needs some work. Thank you!
Juan
— Reply to this email directly, view it on GitHub https://github.com/keras-team/keras/issues/19002#issuecomment-1872800600, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGK34PTYER5DHTKXEL2TADYMEKDPAVCNFSM6AAAAABBHQISG6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNZSHAYDANRQGA . You are receiving this because you were mentioned.Message ID: @.***>
Hi @JuanVargas ,
I think the problem is due to the tf-nightly version pinned in requirements-tensorflow-cuda.txt for Keras 3.0.2 version.
https://github.com/keras-team/keras/blob/fe2f54aa5bc42fb23a96449cf90434ab9bb6a2cd/requirements-tensorflow-cuda.txt#L3
Keras 3.0.2 pulls in the tf-nightly version above, but installing it fails on the CUDA packages with the error log below.
ERROR: Could not find a version that satisfies the requirement tensorrt-libs==8.6.1; extra == "and-cuda" (from tf-nightly[and-cuda]) (from versions: 9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, 9.0.1.post12.dev4, 9.1.0.post11.dev4, 9.1.0.post12.dev4, 9.2.0.post11.dev5, 9.2.0.post12.dev5)
ERROR: No matching distribution found for tensorrt-libs==8.6.1; extra == "and-cuda"
The tf-nightly pin has already been updated to the latest working version on Keras master: https://github.com/keras-team/keras/blob/ccc202a94bbcf02023b6d32ef05a4326eced6e69/requirements-tensorflow-cuda.txt#L3
If you install Keras from master, this problem may not occur.
Thanks!
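To make the failure concrete: pip is asked for exactly tensorrt-libs 8.6.1, while the index only offers 9.x builds, so no candidate can match. The sketch below pulls the pin and the offered versions out of the error line quoted above. The function name `parse_pip_error` is hypothetical, and the regular expressions only target this particular message shape.

```python
import re

# Error line copied from the log quoted above.
ERROR_LINE = (
    'ERROR: Could not find a version that satisfies the requirement '
    'tensorrt-libs==8.6.1; extra == "and-cuda" (from tf-nightly[and-cuda]) '
    '(from versions: 9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, '
    '9.0.1.post12.dev4, 9.1.0.post11.dev4, 9.1.0.post12.dev4, '
    '9.2.0.post11.dev5, 9.2.0.post12.dev5)'
)


def parse_pip_error(line):
    """Extract (package, pinned_version, offered_versions) from the error line."""
    pin = re.search(r"requirement ([A-Za-z0-9_.\-]+)==([^;\s]+)", line)
    offered = re.search(r"\(from versions: ([^)]*)\)", line)
    if pin is None or offered is None:
        raise ValueError("line does not look like a pip 'no matching version' error")
    versions = [v.strip() for v in offered.group(1).split(",") if v.strip()]
    return pin.group(1), pin.group(2), versions


package, pinned, offered = parse_pip_error(ERROR_LINE)
majors = sorted({v.split(".")[0] for v in offered})
print(f"{package}: pinned {pinned}, offered major versions: {majors}")
print("pinned version is offered:", pinned in offered)
```

An exact `==` pin can only be satisfied by a release with that exact version, which is why updating the pinned tf-nightly version is the suggested fix.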
It looks like this is the problem. Thank you so much for your feedback. Juan
Hi @JuanVargas ,
Thanks for the confirmation.
@sampathweb ,
Do we need to cherry-pick the above change (in the master branch) to 3.0.2?
Yes, I think that would be nice and would help. The version of tf/keras that recognizes the GPUs is 2.15. I was hoping to get the GPUs to work with Keras 3.
Hi,
In a previous comment you suggested: " ... If you try installing keras master may be this problem will not occur...."
Could you please let me know how to do that, so that I can use the GPUs and the CUDA API under version 2.16?
I found that all I needed was to read your suggestion more carefully and edit the file requirements-tensorflow-cuda.txt to replace the version. So I created a new virtual env and did the following:
edited requirements-tensorflow-cuda.txt to use 2.16.0-dev20240101
pip install -r requirements-tensorflow-cuda.txt
run python
import tensorflow as tf
tf.__version__
2024-01-16 17:59:46.686812: tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-16 17:59:47.235591: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
tf.__version__
'2.16.0-dev20240101'
tf.config.list_physical_devices('GPU')
2024-01-16 18:03:08.205240:
returns empty list []
2024-01-16 18:03:08.246976: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2256] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices...
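The "Cannot dlopen some GPU libraries" warning means the dynamic loader could not open one or more CUDA shared libraries. One rough way to see which ones are affected is to try loading them directly with ctypes, as sketched below. The library names are assumptions for a CUDA 12 / cuDNN 8 setup and are not taken from this thread, so adjust them to whatever the TensorFlow build expects. If the libraries were installed as pip wheels, they may sit under site-packages and not be on the loader path, in which case this check would report them as missing too.

```python
import ctypes

# Assumed sonames for a CUDA 12 / cuDNN 8 install; not taken from the thread.
CANDIDATE_LIBS = [
    "libcuda.so.1",       # driver library, ships with the NVIDIA driver
    "libcudart.so.12",
    "libcublas.so.12",
    "libcufft.so.11",
    "libcusparse.so.12",
    "libcudnn.so.8",
]


def check_loadable(names):
    """Map each library name to True if the dynamic loader can open it."""
    result = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            result[name] = True
        except OSError:
            # Raised when the library (or one of its dependencies) is not found.
            result[name] = False
    return result


if __name__ == "__main__":
    for name, ok in check_loadable(CANDIDATE_LIBS).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any name reported as MISSING is a candidate for the library TensorFlow failed to load, though the warning text printed just before it in the full log is the authoritative list.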
@JuanVargas Hi, I have run into the same problem. Did you solve it?
I could not solve the problem. I had to go back to version 2.15 :-(