
Building cog without GPU fails

Open skerit opened this issue 6 months ago • 4 comments

One of the reasons I want to use Replicate is that I don't have an NVIDIA GPU. But for some reason I'm unable to push a model without actually having one, since the "validating" step tries to run the generated Docker image at least once, I assume?

Can't I just skip this step?
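Judging by the traceback below, the validation step runs cog's schema generator module inside the freshly built image, and that is where the CUDA library lookup blows up. Something like this should reproduce it locally (the image tag is a placeholder, and the exact invocation is my guess from the traceback):

docker run --rm <your-image> python -m cog.command.openapi_schema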

The error message:

Validating model schema...

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 74, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/cog/command/openapi_schema.py", line 18, in <module>
    app = create_app(config, shutdown_event=None)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/cog/server/http.py", line 72, in create_app
    predictor = load_predictor_from_ref(predictor_ref)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/cog/predictor.py", line 172, in load_predictor_from_ref
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/src/predict.py", line 2, in <module>
    from llama_cpp import Llama
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/llama_cpp/__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 87, in <module>
    _lib = _load_shared_library(_lib_base_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 76, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library '/root/.pyenv/versions/3.11.7/lib/python3.11/site-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory

ⅹ Failed to get type signature: exit status 1
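The failure happens at import time, before any prediction runs: /src/predict.py imports Llama at module level (line 2 in the traceback), and merely importing llama_cpp dlopens libllama.so, which in a CUDA build is linked against libcuda.so.1. A minimal sketch of that failing layout (only the top-level import is taken from the traceback; the rest is illustrative):

from cog import BasePredictor
from llama_cpp import Llama  # module-level import: loads libllama.so, which needs libcuda.so.1

class Predictor(BasePredictor):
    def setup(self):
        # never reached during schema validation; the import above already failed
        self.llm = Llama("/models/MODEL_NAME")  # placeholder path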

skerit avatar Dec 31 '23 13:12 skerit

+1 on this. I've been trying to run it on my M1 Mac and it just doesn't want to play.

sambowenhughes avatar Jan 09 '24 02:01 sambowenhughes

@skerit move the import into the setup step to circumvent the issue. Hacky, but at least you'll be able to deploy the llama_cpp Python pipeline. For example:

from cog import BasePredictor

class Predictor(BasePredictor):
    def setup(self):
        # deferred import: this only runs when the model starts serving,
        # not when cog imports the module to generate the schema
        from llama_cpp import LlamaGrammar, Llama

        self.LlamaGrammar = LlamaGrammar
        model_path = "/models/MODEL_NAME"
        self.llm = Llama(
            model_path,
            n_ctx=2048,
            n_gpu_layers=-1,
            main_gpu=0,
            n_threads=1,
        )
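This works because the schema step only imports predict.py; setup() appears to run later, when the container actually starts serving on a host that has the GPU. A predict() would then use the attributes stored in setup(), for example (the prompt parameter and the completion call are illustrative, following the standard llama-cpp-python API, not part of the original snippet):

from cog import BasePredictor, Input

class Predictor(BasePredictor):
    # setup() as above ...

    def predict(self, prompt: str = Input(description="Prompt to complete")) -> str:
        # self.llm exists because setup() ran (and succeeded) before any prediction
        result = self.llm(prompt, max_tokens=256)
        return result["choices"][0]["text"]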

akshagu avatar Jan 12 '24 04:01 akshagu

Same problem on my Ubuntu 22 machine. No GPU, and I want to deploy without the model validation step.

Slyracoon23 avatar Apr 27 '24 23:04 Slyracoon23