Unable to load models with LlamaCpp
The bug
Loading a GGUF model with models.LlamaCpp raises a TypeError from the llama_cpp.llama_batch_init call:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/teamspace/studios/this_studio/eval.ipynb Cell 3 line 4
      1 from guidance import models, gen, select
      3 path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
----> 4 llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)
      6 from llama_cpp import Llama
      7 llm = Llama(
      8     model_path=path,
      9     n_gpu_layers=-1,
     10 )

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/guidance/models/llama_cpp/_llama_cpp.py:74, in LlamaCpp.__init__(self, model, tokenizer, echo, compute_log_probs, caching, temperature, **kwargs)
     71 else:
     72     raise TypeError("model must be None, a file path string, or a llama_cpp.Llama object.")
---> 74 self._context = _LlamaBatchContext(self.model_obj.n_batch, self.model_obj.n_ctx())
     76 if tokenizer is None:
     77     tokenizer = llama_cpp.LlamaTokenizer(self.model_obj)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/guidance/models/llama_cpp/_llama_cpp.py:23, in _LlamaBatchContext.__init__(self, n_batch, n_ctx)
     21 def __init__(self, n_batch, n_ctx):
     22     self._llama_batch_free = llama_cpp.llama_batch_free
---> 23     self.batch = llama_cpp.llama_batch_init(n_tokens=n_batch, embd=0, n_seq_max=n_ctx)
     24     if self.batch is None:
     25         raise Exception("call to llama_cpp.llama_batch_init returned NULL.")
TypeError: this function takes at least 3 arguments (0 given)
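As a small diagnostic (not part of the original report; the keyword-vs-positional distinction is my reading of the error, not something confirmed by the maintainers), you can call the binding from the last traceback frame directly and see whether the installed llama-cpp-python build accepts keyword arguments for llama_batch_init:

import llama_cpp

# Same call that guidance makes in _LlamaBatchContext.__init__, using keyword arguments.
try:
    batch = llama_cpp.llama_batch_init(n_tokens=512, embd=0, n_seq_max=4096)
except TypeError as err:
    print("keyword-argument call rejected:", err)
    # Positional form of the same call; some builds expose llama_batch_init as a
    # raw ctypes function that only accepts positional arguments.
    batch = llama_cpp.llama_batch_init(512, 0, 4096)

llama_cpp.llama_batch_free(batch)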
To Reproduce
from guidance import models, gen, select
path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)
System info (please complete the following information):
- OS (e.g. Ubuntu, Windows 11, Mac OS, etc.):
- Guidance Version (guidance.__version__):
I was facing the same error; install llama-cpp-python==0.2.26 and it should work!
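For reference, a quick way to confirm the pinned setup before retrying the load (the pin is the one suggested above; the load call is the one from the issue, and the version attributes are assumed to be available on both packages):

# Pin the working version first, e.g. in a shell or notebook cell:
#   pip install "llama-cpp-python==0.2.26"
import llama_cpp
import guidance
from guidance import models

print("llama-cpp-python:", llama_cpp.__version__)
print("guidance:", guidance.__version__)

# Original load from the issue; with 0.2.26 installed this reportedly works.
path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)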
Using an older llama-cpp-python version works but limits the use of some newer models; for example, you can't load stabilityai/stablelm-2-zephyr-1_6b on 0.2.26. We need the team to bump compatibility, ideally to the latest version :)
I think this issue is resolved with PR https://github.com/guidance-ai/guidance/pull/665, but there hasn't been a release since then.
> but there hasn't been a release since then.
Should we expect a release any time soon, or should I simply cherry-pick the change into a local fork?
EDIT: @paulbkoch maybe consider offering a nightly version on PyPI that reflects the latest state of development without requiring us to pip install from GitHub?
Any hope for a release to PyPI soon with this fix?
https://github.com/guidance-ai/guidance/discussions/692
> but there hasn't been a release since then.
> Should we expect a release any time soon, or should I simply cherry-pick the change into a local fork?
> EDIT: @paulbkoch maybe consider offering a nightly version on PyPI that reflects the latest state of development without requiring us to pip install from GitHub?
Has the pull request been merged, so that installing directly from GitHub picks it up? Or is there a way to install directly from GitHub and pull in the pull request at the same time?
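For anyone stuck on the same question: one general option (my suggestion, not maintainer guidance) is to point pip at the repository directly; pip can install from the default branch, or from a specific branch or commit if the fix isn't on the default branch yet. The ref placeholder below is an assumption you'd fill in yourself:

# Run in a shell or notebook cell (the ref is a placeholder, not a real branch name):
#   pip install --upgrade "git+https://github.com/guidance-ai/guidance.git"
#   pip install --upgrade "git+https://github.com/guidance-ai/guidance.git@<branch-or-commit>"
# Then retry the original load:
import guidance
from guidance import models

print("guidance:", guidance.__version__)
path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)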