
"minimal" example Dockerfile

Open behrica opened this issue 2 years ago • 10 comments

It sets up MKL, CUDA, and OpenCL on top of the Ubuntu 20.04 Docker image.

Maybe it is worth keeping and mentioning as an additional way of setting up Neanderthal.

I went for CUDA 11.4, so we need to override the JCuda deps, as shown in deps.edn.minimal

behrica avatar Jun 23 '22 20:06 behrica

I am pretty sure CUDA 11.4.1 will not work with Deep Diamond, as Nvidia changes the API from time to time, and several neural network functions that I use (and perhaps many more) simply do not exist in 11.4, have a different signature, or plainly work differently under the hood. Neanderthal could work, but who knows. ClojureCUDA should work with 11.4 without problems, but I can't be sure.

blueberry avatar Jun 23 '22 20:06 blueberry

I thought that Deep Diamond uses Neanderthal only.

So "all tests passing" is not enough as a test?

behrica avatar Jun 23 '22 20:06 behrica

CUDA 11.6 has the glibc issue we talked about before.

I still think it is useful to keep it "somewhere", as a reference if somebody (including myself) is struggling to set things up.

Maybe we can keep the idea of one or more "example Dockerfiles". This one is for CUDA 11.4, with all tests passing.

I can try again with CUDA 11.6, as another "example". So as a kind of "living instructions". Not sure whether you find that useful.

behrica avatar Jun 23 '22 20:06 behrica

As long as it helps any user, I find it useful. I'm just afraid that it could become a complex but broken solution to a much simpler problem. If 11.6 does not work because of an old glibc, the right solutions are:

  1. update the operating system to a more recent version with recent glibc, or, if that is not possible for whatever reason,
  2. build JCuda itself on the system with older glibc, and contribute that build upstream to JCuda.

blueberry avatar Jun 23 '22 20:06 blueberry

I will give it a try with CUDA 11.6. It's just a few changes in the Dockerfile, so we will see quickly if that works out.

Going to Ubuntu 22.04 does indeed mean CUDA 11.7, for which there is not (yet) a JCuda release.

At least I am learning a lot ...

behrica avatar Jun 23 '22 21:06 behrica

Maybe starting from here is even better:

https://hub.docker.com/layers/cuda/nvidia/cuda/11.6.1-runtime-ubuntu20.04/images/sha256-b59497e63c4d8cefac1152ceeb564830ed2f46e7d417c822a5813464a10394d2?context=explore

behrica avatar Jun 23 '22 21:06 behrica

It should be possible to install an earlier CUDA version, such as 11.6, on Ubuntu 22.04; it's just not the default. 11.7 came out fairly recently, and I bet most of the CUDA-dependent software that people run on Ubuntu still needs earlier versions, so earlier versions should be available.

blueberry avatar Jun 23 '22 21:06 blueberry

The "official" NVIDIA downloads do not have it: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/

behrica avatar Jun 23 '22 21:06 behrica

OK, now it's crystal clear: JCuda 11.6 does not work on Ubuntu 20.04 (and likely on a lot of other distributions).

Using the modified Dockerfile (and the NVIDIA Docker image) gives:

/tmp/libJCudaDriver-11.6.1-linux-x86_64.so: /usr/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /tmp/libJCudaDriver-11.6.1-linux-x86_64.so)

As was hinted before.

So Arch users are lucky, because we have (GNU libc) 2.35
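
The mismatch in the error above can also be checked up front, without starting the JVM. What follows is only a minimal sketch, not something taken from the Dockerfile in this issue: the helper names `max_glibc_tag` and `version_ge` are made up for illustration, and the usage lines assume a glibc-based system with `objdump` (binutils) and `getconf` installed.

```shell
#!/bin/sh
# Sketch: compare the glibc version a native library asks for
# with the glibc version the system provides.

# Print the highest GLIBC_x.y[.z] version tag found in the text on stdin.
# Works on `objdump -T` output as well as on a pasted loader error message.
max_glibc_tag() {
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed -e 's/^GLIBC_//' -e 's/\.$//' \
        | sort -t. -k1,1n -k2,2n -k3,3n \
        | tail -n 1
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    newest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | tail -n 1)
    [ "$newest" = "$1" ]
}

# Usage (the library path is the one from the error message above):
#   required=$(objdump -T /tmp/libJCudaDriver-11.6.1-linux-x86_64.so | max_glibc_tag)
#   provided=$(getconf GNU_LIBC_VERSION | cut -d' ' -f2)
#   version_ge "$provided" "$required" \
#       || echo "glibc $provided is older than the required $required"
```

On the Ubuntu 20.04 image this comparison would be expected to fail, since that release ships glibc 2.31 and the error message asks for 2.34.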

behrica avatar Jun 23 '22 21:06 behrica

> The "official" NVIDIA downloads do not have it: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/

They do have it: there is a separate generic Linux distribution of CUDA that covers distributions other than Ubuntu and Fedora, and it should be compatible with all Linux distros, including Ubuntu and Fedora. The installer is run as a shell script, and Arch Linux and other distros not officially supported by NVIDIA wrap it in their own packages.

blueberry avatar Jun 23 '22 22:06 blueberry