keras-nlp icon indicating copy to clipboard operation
keras-nlp copied to clipboard

In colab, keras-nlp doesn't detect GPU anymore

Open joaopauloschuler opened this issue 2 years ago • 8 comments

Hello, I've just created a bug report at colab but I suspect that the problem might be at the Keras NLP side. Therefore, I'll replicate it here:

https://github.com/googlecolab/colabtools/issues/4153

Hello!

Current behavior In google colab, after importing keras-nlp, access to GPU is lost:

!pip install keras-nlp
import tensorflow
print(tensorflow.test.gpu_device_name())

The output of tensorflow.test.gpu_device_name() is an empty string.

At this point in time, it seems impossible to use GPU with Keras NLP.

!nvidia-smi does have output.

Sat Nov 18 02:47:16 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   50C    P8    10W /  70W |      3MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+ 

Expected behavior The output of tensorflow.test.gpu_device_name() should be something like /device:GPU:0.

May the force be with you, JP.

joaopauloschuler avatar Nov 18 '23 03:11 joaopauloschuler

It seems that we have a new behavior when installing keras_nlp:

!pip install keras-nlp==0.6.3
Collecting keras-nlp==0.6.3
  Downloading keras_nlp-0.6.3-py3-none-any.whl (584 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 584.5/584.5 kB 10.1 MB/s eta 0:00:00
Collecting keras-core (from keras-nlp==0.6.3)
  Downloading keras_core-0.1.7-py3-none-any.whl (950 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 950.8/950.8 kB 56.5 MB/s eta 0:00:00
Requirement already satisfied: absl-py in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (1.4.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (1.23.5)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (23.2)
Requirement already satisfied: regex in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (2023.6.3)
Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (13.7.0)
Requirement already satisfied: dm-tree in /usr/local/lib/python3.10/dist-packages (from keras-nlp==0.6.3) (0.1.8)
Collecting tensorflow-text (from keras-nlp==0.6.3)
  Downloading tensorflow_text-2.15.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.2/5.2 MB 36.3 MB/s eta 0:00:00
Collecting namex (from keras-core->keras-nlp==0.6.3)
  Downloading namex-0.0.7-py3-none-any.whl (5.8 kB)
Requirement already satisfied: h5py in /usr/local/lib/python3.10/dist-packages (from keras-core->keras-nlp==0.6.3) (3.9.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->keras-nlp==0.6.3) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->keras-nlp==0.6.3) (2.16.1)
Requirement already satisfied: tensorflow-hub>=0.13.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow-text->keras-nlp==0.6.3) (0.15.0)
Collecting tensorflow<2.16,>=2.15.0 (from tensorflow-text->keras-nlp==0.6.3)
  Downloading tensorflow-2.15.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (475.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 475.2/475.2 MB 3.2 MB/s eta 0:00:00
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->keras-nlp==0.6.3) (0.1.2)
Requirement already satisfied: astunparse>=1.6.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.6.3)
Requirement already satisfied: flatbuffers>=23.5.26 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (23.5.26)
Requirement already satisfied: gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.5.4)
Requirement already satisfied: google-pasta>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.2.0)
Requirement already satisfied: libclang>=13.0.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (16.0.6)
Requirement already satisfied: ml-dtypes~=0.2.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.2.0)
Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.3.0)
Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.20.3)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (67.7.2)
Requirement already satisfied: six>=1.12.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.16.0)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2.3.0)
Requirement already satisfied: typing-extensions>=3.6.6 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (4.5.0)
Requirement already satisfied: wrapt<1.15,>=1.11.0 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.14.1)
Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.34.0)
Requirement already satisfied: grpcio<2.0,>=1.24.3 in /usr/local/lib/python3.10/dist-packages (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.59.2)
Collecting tensorboard<2.16,>=2.15 (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3)
  Downloading tensorboard-2.15.1-py3-none-any.whl (5.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.5/5.5 MB 110.2 MB/s eta 0:00:00
Collecting tensorflow-estimator<2.16,>=2.15.0 (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3)
  Downloading tensorflow_estimator-2.15.0-py2.py3-none-any.whl (441 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 442.0/442.0 kB 47.4 MB/s eta 0:00:00
Collecting keras<2.16,>=2.15.0 (from tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3)
  Downloading keras-2.15.0-py3-none-any.whl (1.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 87.0 MB/s eta 0:00:00
Requirement already satisfied: wheel<1.0,>=0.23.0 in /usr/local/lib/python3.10/dist-packages (from astunparse>=1.6.0->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.41.3)
Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2.17.3)
Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.0.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.5.1)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2.31.0)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.0.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (5.3.2)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.3.0)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (4.9)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (1.3.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.21.0->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.21.0->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.21.0->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.21.0->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2023.7.22)
Requirement already satisfied: MarkupSafe>=2.1.1 in /usr/local/lib/python3.10/dist-packages (from werkzeug>=1.0.1->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (2.1.3)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (0.5.0)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.10/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tensorflow-text->keras-nlp==0.6.3) (3.2.2)
Installing collected packages: namex, tensorflow-estimator, keras, keras-core, tensorboard, tensorflow, tensorflow-text, keras-nlp
  Attempting uninstall: tensorflow-estimator
    Found existing installation: tensorflow-estimator 2.14.0
    Uninstalling tensorflow-estimator-2.14.0:
      Successfully uninstalled tensorflow-estimator-2.14.0
  Attempting uninstall: keras
    Found existing installation: keras 2.14.0
    Uninstalling keras-2.14.0:
      Successfully uninstalled keras-2.14.0
  Attempting uninstall: tensorboard
    Found existing installation: tensorboard 2.14.1
    Uninstalling tensorboard-2.14.1:
      Successfully uninstalled tensorboard-2.14.1
  Attempting uninstall: tensorflow
    Found existing installation: tensorflow 2.14.0
    Uninstalling tensorflow-2.14.0:
      Successfully uninstalled tensorflow-2.14.0
Successfully installed keras-2.15.0 keras-core-0.1.7 keras-nlp-0.6.3 namex-0.0.7 tensorboard-2.15.1 tensorflow-2.15.0 tensorflow-estimator-2.15.0 tensorflow-text-2.15.0

It's replacing tensorflow with another version. It wasn't doing this 4 days ago.

joaopauloschuler avatar Nov 18 '23 03:11 joaopauloschuler

In the case it helps, in my case, I was able to bypass the problem with: !pip install keras-nlp keras-core tensorflow-text --no-deps

joaopauloschuler avatar Nov 19 '23 07:11 joaopauloschuler

In a Jupyter Notebook environment (which is in GCP Docker Image : tensorflow/tensorflow:2.14.0-gpu-jupyter) , the problem is not solved by the method mentioned above. Is there any other solution available?


!pip install keras-nlp==0.6.2 !pip install tensorflow==2.14.0

simple method, but it worked for me

dellaanima avatar Nov 20 '23 02:11 dellaanima

!apt-get install cuda-toolkit

Adding the above before installing keras-nlp solved for me.

joaopauloschuler avatar Nov 22 '23 01:11 joaopauloschuler

Thanks for filing! I would guess the issue is that colab currently ships tensorflow==2.14 and pip install keras-nlp will by default pull in the latest stable tensorflow-text and tensorflow (2.15), which uses a entirely different version of cuda.

The easy approach would just be to keep tensorflow pinned as people are suggesting above... pip install keras-nlp tensorflow-text~=2.14.0 tensorflow~=2.14.0 for now, until colab updates it's verison of tensorflow/cuda. Alternately, there may be a way to configure colab to use tensorflow 2.15 with GPU support, I am not sure.

Regardless, I don't think this is really something we can fix on keras-nlp's side. As you would probably see the exact same behavior simply running pip install -U tensorflow.

mattdangerw avatar Nov 26 '23 02:11 mattdangerw

I've no issue in Colab with GPU runtime and the following packages:

!pip install git+https://github.com/keras-team/keras-nlp.git keras-core tensorflow[and-cuda] --upgrade
import keras_core as keras
import tensorflow as tf
import keras_nlp
print(keras.__version__,tf.__version__,keras_nlp.__version__)

alessiosavi avatar Nov 29 '23 09:11 alessiosavi

Hi @joaopauloschuler,

Thanks for reporting this. keras-nlp has been renamed to keras-hub, you can install keras-hub using
 !pip install -Uq keras-hub. The code seems to be working fine in the latest keras-hub release. Attaching gist.

Please close the issue if it doesn't reproduce anymore. Thanks!

dhantule avatar Dec 04 '24 06:12 dhantule

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Feb 14 '25 02:02 github-actions[bot]