colab-tricks
nvidia-smi problem after SSH into Colab
Thank you for the great tips on SSH-ing into Colab in ngrok-tricks.ipynb.
I ran into one problem. In the Colab notebook, nvidia-smi works:
Fri Mar 20 00:58:40 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Your runtime has 27.4 gigabytes of available RAM
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
But from my Mac, after SSH-ing into Colab, nvidia-smi fails:
$ ssh -p 10055 [email protected] -i ~/.ssh/id_rsa_colab
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.14.137+ x86_64)
.....
root@118ff34f0bd4:~# nvidia-smi; /usr/local/cuda/bin/nvcc --version
Failed to initialize NVML: Driver/library version mismatch
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
root@118ff34f0bd4:~#
I just found that the same error occurs even in the Colab notebook when nvidia-smi is run with sudo:
!which nvidia-smi
!sudo nvidia-smi
/usr/bin/nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
So the problem seems to come from running nvidia-smi as root (the SSH session is a root login), not from SSH itself.
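One possible way to narrow this down is to compare the environment seen by the notebook with the one seen by the root shell. This is only a sketch: it assumes (not confirmed anywhere above) that the two differ in some variable such as LD_LIBRARY_PATH, and the helper names parse_env and diff_env are hypothetical. The idea is to save the output of env from a notebook cell and from the SSH session, then diff the two.

```python
def parse_env(text):
    """Parse the output of `env` into a dict of NAME -> value.

    Lines without '=' are skipped, so multi-line values are not handled.
    """
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


def diff_env(notebook_text, ssh_text):
    """Return {name: (notebook_value, ssh_value)} for variables that differ.

    A value of None means the variable is missing on that side.
    """
    a, b = parse_env(notebook_text), parse_env(ssh_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Made-up sample values for illustration only, not real Colab output.
    notebook = "PATH=/usr/local/bin:/usr/bin\nLD_LIBRARY_PATH=/example/lib64\n"
    ssh = "PATH=/usr/local/bin:/usr/bin\n"
    for name, (nb, remote) in diff_env(notebook, ssh).items():
        print(f"{name}: notebook={nb!r} ssh={remote!r}")
```

If a library-path variable shows up only on the notebook side, exporting the same value in the SSH shell before running nvidia-smi would be a cheap thing to try. If the environments match, the cause is somewhere else.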