MinkowskiEngine

GPU-enabled version can't run without drivers

Open dmzio opened this issue 5 years ago • 3 comments

For portability we run MinkowskiEngine inside Docker (base image pytorch:1.5.1-cuda10.1-cudnn7-devel, ME 0.4.3). With the recent introduction of --force_cuda it builds perfectly, but an issue arises at runtime:

  • running the container with a GPU works well;
  • launching the container without a GPU results in failures like:
>>> import torch, MinkowskiEngine as ME
>>> ME.SparseTensor(torch.FloatTensor(0, 16), coords=torch.IntTensor(0, 4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/SparseTensor.py", line 290, in __init__
    coords_manager = CoordsManager(D=coords.size(1) - 1)
  File "/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/MinkowskiCoords.py", line 114, in __init__
    coords_man = MEB.CoordsManager(num_threads, memory_manager_backend)
RuntimeError:  CUDA driver version is insufficient for CUDA runtime version at /opt/MinkowskiEngine/src/gpu_memory_manager.hpp:57

So in short, the GPU-compiled version can't be used if no card or driver is present (PyTorch handles this case by defaulting to CPU, and I expected the same here).

Installing drivers into the container during the build didn't help; a CPU-only build works perfectly in this case.

Is this expected behavior? Is it possible to work around it?

dmzio avatar Jul 20 '20 08:07 dmzio

Yes, this is expected, since you need a GPU driver to actually use a GPU.

It is possible to overcome this, but I don't see any practical use case when you can simply install it with --cpu_only.

chrischoy avatar Aug 17 '20 23:08 chrischoy

Thanks for the reply. The practical use case is transferable Docker images. In my particular case I wanted to build a single GPU-enabled image and use it at every stage of CI/CD: many of those stages have no GPU, while the production deployment does. So now I need to build and maintain a different image version for each stage of the pipeline.

If it is possible to overcome this, I would appreciate any clues on how to do that. (Ideally, of course, it would be possible to simply run the GPU-enabled binary on a CPU, as PyTorch does, for example.)

dmzio avatar Aug 18 '20 08:08 dmzio

I have the same use case. Is there a workaround for this in v0.5?

bzporter avatar Jan 08 '21 16:01 bzporter
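
No workaround inside MinkowskiEngine itself is given in this thread. For the CI/CD case described above, one possible stopgap is to keep two image variants (one built with --force_cuda, one with --cpu_only) and let each pipeline stage choose between them. The following is a minimal sketch under stated assumptions, not a MinkowskiEngine feature: the repository name `example/me-app` and the `gpu`/`cpu` tags are hypothetical, and the driver check simply assumes that a working `nvidia-smi -L` means a usable driver is present.

```python
import shutil
import subprocess


def gpu_driver_present(which=shutil.which, run=subprocess.run):
    """Return True if `nvidia-smi` is on PATH and `nvidia-smi -L` exits with 0."""
    exe = which("nvidia-smi")
    if exe is None:
        # No NVIDIA tooling on this host: treat it as CPU-only.
        return False
    try:
        result = run(
            [exe, "-L"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a broken binary, a timeout, and similar launch failures.
        return False
    return result.returncode == 0


def pick_image_tag(has_gpu, repo="example/me-app"):
    """Pick the image variant for this stage.

    Hypothetical tags: "gpu" is assumed to be built with --force_cuda,
    "cpu" with --cpu_only.
    """
    return f"{repo}:{'gpu' if has_gpu else 'cpu'}"
```

A pipeline stage could then call `pick_image_tag(gpu_driver_present())` and pull the resulting image. This still means building two images, but the choice between them is automated rather than configured by hand per stage.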