neural-style
Error when running the command nvidia-smi
My system is Ubuntu 14.04 on Win10. I have installed CUDA 8.0, but when I run the command nvidia-smi, I get an error like this:
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/3.4.0+/modules.dep.bin'
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/3.4.0+/modules.dep.bin'
modprobe: ERROR: ../libkmod/libkmod-module.c:809 kmod_module_insert_module() could not find module by name='nvidia_367'
modprobe: ERROR: could not insert 'nvidia_367': Function not implemented
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
How can I solve it?
The installation of the NVIDIA driver has failed. From this output it is impossible to know why. One reason could be missing kernel headers. The NVIDIA documentation and forums are the best sources of help for this kind of problem.
@htoyryla I have tried to install the NVIDIA driver, but it says that it can't find kernel 3.4.0+. I ran the command apt-cache search linux-headers, but there seems to be no headers package named linux-headers-3.4.0+.
Where did you get this 3.4.0+ from? It does not sound right. Which kernel do you have? What does uname -r give? I would guess 3.13.0-something.
The installation guide at http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#axzz4MOGxcY2j gives the following supported versions.
Ubuntu 16.04: 4.4.0
Ubuntu 14.04: 3.13
Furthermore from the guide, a helpful command to install the correct kernel headers:
Ubuntu
The kernel headers and development packages for the currently running kernel can be installed with:
$ sudo apt-get install linux-headers-$(uname -r)
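Before installing, it may help to confirm which kernel is running and whether apt knows a matching headers package at all. The following is only a sketch under assumptions: the helper name headers_package is hypothetical, and the check relies on apt-cache show typically exiting non-zero for an unknown package.

```shell
#!/bin/sh
# Hypothetical helper: build the headers package name for a kernel release.
headers_package() {
    printf 'linux-headers-%s\n' "$1"
}

kernel="$(uname -r)"
pkg="$(headers_package "$kernel")"
echo "Running kernel: $kernel"

# Assumption: apt-cache show exits non-zero when the package is unknown to apt.
if command -v apt-cache >/dev/null 2>&1 && apt-cache show "$pkg" >/dev/null 2>&1; then
    echo "Headers available: sudo apt-get install $pkg"
else
    echo "No package named $pkg found; this may not be a stock Ubuntu kernel."
fi
```

If the second message appears, installing the headers with the command above will most likely fail as well.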
I am running 14.04 with a 4.4.0 kernel (although still cuda 8.0rc, have not yet moved to 8.0).
@htoyryla Here is some info about my system. So what is the problem? I got this system from Win10.
Can't really help anymore. You seem to have a 3.4.0 kernel, but instead of a precise version number it has this +. Google doesn't give anything helpful on "kernel 3.4.0+" either.
Anyhow, 3.4.0 is old already and might not even work with cuda 8.0 (the docs state 3.13 and 4.4.0).
Just realized (based on another issue) that you are running on Windows bash? Others, too, seem to be complaining of not finding the kernel headers for 3.4.0. I wonder whether it is possible to update the kernel in Windows bash?
Update: I guess you are stuck to that kernel. Saw somewhere that "the Ubuntu userspace is running not on a Linux kernel, but WSL. WSL provides the API hooks to look like Linux to Ubuntu and Linux applications, but it's not the same thing." So one cannot replace it with another linux kernel, one is stuck with whatever Microsoft provides.
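A quick way to tell whether a shell is running under Windows bash is to look for a Microsoft marker in the kernel version string. This is a minimal sketch, assuming that /proc/version mentions "Microsoft" under WSL (an assumption not confirmed in this thread); the helper name looks_like_wsl is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a kernel version string mentions Microsoft.
looks_like_wsl() {
    case "$1" in
        *[Mm]icrosoft*) return 0 ;;
        *) return 1 ;;
    esac
}

# Fall back to uname -a if /proc/version cannot be read.
version="$(cat /proc/version 2>/dev/null || uname -a)"
if looks_like_wsl "$version"; then
    echo "Probably Windows bash (WSL): the kernel is whatever Microsoft provides."
else
    echo "No Microsoft marker found; this looks like a regular Linux kernel."
fi
```

If the first message appears, installing kernel headers or loading the nvidia module is unlikely to work, for the reasons quoted above.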
Yes, I'm running on Win10's bash. Thanks a lot. But I don't know how to change my system's kernel.
It's so hard to change the kernel. I can only use the lbfgs method.
@251099155 Hello, I have the same problem. Have you solved it?
I have a similar problem. My system is Red Hat 7, my GPU is a K80 in Azure, and my CUDA version is 8. The error is: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I solved it by running modprobe nvidia.
Then everything is OK. My God.
@Matrixsun I have the same issue, but when I run modprobe nvidia I get this error: "modprobe: ERROR: could not insert 'nvidia_375': No such device".
Would you mind elaborating on how you solved "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver"?
My setup: Ubuntu 14.04, K80, in Azure.
@rizkyario Sorry for being so brief. Details as follows:
Azure's Linux system lacks a lot of basic packages. My solution was to install the packages I needed one by one before installing CUDA.
1. Install GCC
yum install gcc*
2. Install dkms
wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm
rpm -ivh epel-release-7-9.noarch.rpm
yum install --enablerepo=epel dkms
3. Install kernel-related components
yum install kernel*
4. Install CUDA
5. modprobe nvidia
Then it is OK.
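After the last step, one way to verify that the module was actually loaded is to look for it in /proc/modules. This is a minimal sketch, not part of the steps above; the helper name nvidia_in_lsmod is hypothetical, and it assumes the module name starts with "nvidia" (as in nvidia_367 or nvidia_375).

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style lines on stdin and succeed if a
# module whose name (first field) starts with "nvidia" is listed.
nvidia_in_lsmod() {
    awk '$1 ~ /^nvidia/ { found = 1 } END { exit (found ? 0 : 1) }'
}

if [ -r /proc/modules ] && nvidia_in_lsmod < /proc/modules; then
    echo "nvidia module is loaded."
else
    echo "nvidia module is not loaded; try: sudo modprobe nvidia"
fi
```

If the module is reported as loaded and nvidia-smi still fails, the cause is probably somewhere other than module loading.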
I am trying to run NVIDIA driver on AWS running with Ubuntu 16.04. I have installed the kernel headers and successfully installed the NVIDIA-375.26 driver that comes with the official CUDA-8.0 release. However, I keep getting "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running", no matter how many times I reinstall, run sudo update-initramfs -u and reboot. When I tried the method recommended by @Matrixsun, I got the following error: "modprobe: FATAL: Module nvidia not found in directory /lib/modules/4.4.0-1020-aws". What do I do to resolve this?
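The "Module nvidia not found in directory" message suggests checking whether any nvidia module file exists for the kernel that is currently running. The following is only a sketch under assumptions: the helper name find_nvidia_modules is hypothetical, and the file pattern nvidia*.ko* is a guess at how the driver names its module files.

```shell
#!/bin/sh
# Hypothetical helper: list nvidia kernel module files under a modules directory.
find_nvidia_modules() {
    find "$1" -name 'nvidia*.ko*' 2>/dev/null
}

moddir="/lib/modules/$(uname -r)"
found="$(find_nvidia_modules "$moddir" || true)"
if [ -n "$found" ]; then
    echo "nvidia module files for the running kernel:"
    echo "$found"
else
    echo "No nvidia module under $moddir; the driver may have been built for another kernel."
    echo "Kernels with a modules directory:"
    ls /lib/modules 2>/dev/null || true
fi
```

If the module files turn up only under a different kernel version, the driver was likely built against that kernel's headers rather than the running one.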
I am trying to run NVIDIA driver on AWS running with Ubuntu 16.04
What instance type?
p2.xlarge with NVIDIA Tesla K80