Low level API example failed to run
I attempted to run the low-level API example with version 0.2.11, but it fails whether the package is installed from PyPI or compiled from source. python: 3.10.12, llama_cpp_python: 0.2.11
{llama-cpp-python/examples/low_level_api}$ python low_level_api_llama_cpp.py
Traceback (most recent call last):
File ".../llama-cpp-python/examples/low_level_api/low_level_api_llama_cpp.py", line 15, in <module>
model = llama_cpp.llama_load_model_from_file(MODEL_PATH.encode('utf-8'), lparams)
File ".../llama-cpp-python/llama_cpp/llama_cpp.py", line 498, in llama_load_model_from_file
return _lib.llama_load_model_from_file(path_model, params)
ctypes.ArgumentError: argument 2: TypeError: expected llama_model_params instance instead of llama_context_params
@islwx Hello, if you find out the solution please write it here, I am struggling as well. I think this is also connected to the server 500 error problem. Thanks.
I tried an older version and had no issues. This is clearly a case where the low-level API examples have not kept up with the version updates. The developers need to update the examples and the related documentation.
same problem
Just updated from pip and got the same issue. The fix is to use llama_cpp.llama_model_default_params() for the model-loading call:
self.lparams = llama_cpp.llama_context_default_params()
self.mparams = llama_cpp.llama_model_default_params()
self.model = llama_cpp.llama_load_model_from_file(model_path.encode('utf-8'), self.mparams)
self.ctx = llama_cpp.llama_new_context_with_model(self.model, self.lparams)
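Putting the fix together, here is a minimal sketch of the corrected load/teardown order (assuming llama_backend_init has already been called as in the original example script, and that llama_free_model / llama_free are the matching cleanup calls):

import llama_cpp

MODEL_PATH = "./models/7b/llama-model.gguf"

# model-level settings now go to llama_load_model_from_file ...
mparams = llama_cpp.llama_model_default_params()
# ... and context-level settings (n_ctx, etc.) go to llama_new_context_with_model
lparams = llama_cpp.llama_context_default_params()

model = llama_cpp.llama_load_model_from_file(MODEL_PATH.encode("utf-8"), mparams)
ctx = llama_cpp.llama_new_context_with_model(model, lparams)

# ... work with ctx ...

llama_cpp.llama_free(ctx)
llama_cpp.llama_free_model(model)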
I modified the sample code in the README.
llama_cpp_python: 0.2.90
import llama_cpp
import ctypes
llama_cpp.llama_backend_init() # Must be called once at the start of each program
lparams = llama_cpp.llama_context_default_params()
mparams = llama_cpp.llama_model_default_params()
# use bytes for char * params
model = llama_cpp.llama_load_model_from_file(b"./models/7b/llama-model.gguf", mparams)
ctx = llama_cpp.llama_new_context_with_model(model, lparams)
max_tokens = lparams.n_ctx
# use ctypes arrays for array params
tokens = (llama_cpp.llama_token * int(max_tokens))()
prompt = "Q: Name the planets in the solar system? A: "
pbytes = bytes(prompt, "utf-8")
n_tokens = llama_cpp.llama_tokenize(model, pbytes, len(pbytes), tokens, len(tokens), False, False)
llama_cpp.llama_free(ctx)
print(tokens[:n_tokens])
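One caveat worth adding (based on llama.cpp's documented behavior, not on anything in the README): llama_tokenize returns a negative value when the token buffer is too small, so a defensive check right after the tokenize call makes that failure obvious:

# sketch: place this right after the llama_tokenize call above;
# a negative return means the buffer was too small, and its magnitude is the required size
if n_tokens < 0:
    raise RuntimeError(f"token buffer too small, need {-n_tokens} slots")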
Equivalent:
from llama_cpp import Llama
llm = Llama("./models/7b/llama-model.gguf")
prompt = "Q: Name the planets in the solar system? A: "
tokens = llm.tokenize(bytes(prompt, "utf-8"), False)
print(tokens)
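As a follow-up sketch (assuming Llama.detokenize is available in this version), the token ids can be round-tripped back to text on the same llm instance:

# round-trip the ids back to bytes and decode for display
text = llm.detokenize(tokens)
print(text.decode("utf-8"))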