# `assert token_byte_positions[-1] == last_pos` on `gen()`

### The bug
The following error occurs when generating with certain models: `assert token_byte_positions[-1] == last_pos` fails inside `gen()`. It can be reproduced with the code below. Note that this bug only occurs with certain LLMs.
### To reproduce
It fails with this LLM:

```python
# Imports
import guidance
from guidance import image
from guidance import user, assistant, system
from guidance import gen, select
from guidance import capture, Tool, regex

# Paths
path_tess = "/home/sr/Desktop/CloserModels/tess-34b-v1.5b.Q5_K_M.gguf"

# Models
model = guidance.models.LlamaCpp(path_tess, n_gpu_layers=-1, n_ctx=2048)
llama2 = model
lm = llama2 + "Explain Oppenheimer's contribution to the world" + gen(name="output")
```
### Error
```
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 15
     12 model = guidance.models.LlamaCpp(path_tess, n_gpu_layers=-1, n_ctx=2048)
     13 llama2 = model
---> 15 lm = llama2 + "Explain Oppenheimer's contribution to the world" + gen(name="output")

File ~/anaconda3/envs/llamacpptest34/lib/python3.11/site-packages/guidance/models/_model.py:302, in Model.__add__(self, value)
    300 # run stateless functions (grammar nodes)
    301 elif isinstance(value, StatelessFunction):
--> 302     out = lm._run_stateless(value)
    304 # run stateful functions
    305 else:
    306     out = value(lm)

File ~/anaconda3/envs/llamacpptest34/lib/python3.11/site-packages/guidance/models/_model.py:465, in Model._run_stateless(lm, stateless_function, temperature, top_p, n)
    463 delayed_bytes = b""
    464 # last_is_generated = False
--> 465 for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
    466
    467     # we make everything full probability if we are not computing uncertainty
    468     if not lm.compute_log_probs:
    469         new_bytes_prob = 1.0

File ~/anaconda3/envs/llamacpptest34/lib/python3.11/site-packages/guidance/models/_model.py:638, in Model.__call__(self, grammar, max_tokens, n, top_p, temperature, ensure_bos_token)
    636 # run a simple tokenizer (that does not use a grammar) on the prefix for better performance
    637 token_ids, token_byte_positions = self._tokenize_prefix(prompt)
--> 638 token_ids, token_byte_positions = self._cleanup_tokens(token_ids, token_byte_positions)
    639 if len(token_byte_positions) > 0:
    640     pre_parser_bytes = token_byte_positions[-1]

File ~/anaconda3/envs/llamacpptest34/lib/python3.11/site-packages/guidance/models/_model.py:611, in Model._cleanup_tokens(self, token_ids, token_byte_positions)
    609 for i in range(1, len(token_byte_positions)):
    610     token_byte_positions[i] -= 1
--> 611 assert token_byte_positions[-1] == last_pos

    613 return token_ids, token_byte_positions

AssertionError:
```
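For context, here is a minimal sketch (not guidance's actual code, just an illustration) of the invariant the failing assert enforces: after tokenizing the prompt, the cumulative byte offsets of the tokens must end exactly at the prompt's byte length. A tokenizer whose byte mapping silently adds or drops bytes breaks it.

```python
def token_byte_positions(token_bytes):
    """Cumulative end-of-token byte offsets, e.g. [b"He", b"llo"] -> [2, 5]."""
    positions, pos = [], 0
    for tb in token_bytes:
        pos += len(tb)
        positions.append(pos)
    return positions

prompt = b"Explain Oppenheimer's contribution to the world"

# A tokenization that preserves every byte satisfies the invariant:
tokens = [prompt[i:i + 4] for i in range(0, len(prompt), 4)]
assert token_byte_positions(tokens)[-1] == len(prompt)

# One that introduces a byte the prompt never had (e.g. an unaccounted-for
# leading space from a sentencepiece-style vocabulary) violates it, which
# is the condition the AssertionError reports:
bad_tokens = [b" "] + tokens
assert token_byte_positions(bad_tokens)[-1] != len(prompt)
```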
But it works with this LLM:

```python
# Imports
import guidance
from guidance import image
from guidance import user, assistant, system
from guidance import gen, select
from guidance import capture, Tool, regex

# Paths
path_tess = "/home/sr/Desktop/CloserModels/bagel-dpo-7b-v0.4.Q8_0.gguf"

# Models
model = guidance.models.LlamaCpp(path_tess, n_gpu_layers=-1, n_ctx=2048)
llama2 = model
lm = llama2 + "Explain Oppenheimer's contribution to the world" + gen(name="output")
```
### Output
```
Explain Oppenheimer's contribution to the world of physics.
Oppenheimer's contribution to the world of physics is significant and far-reaching. He was a key figure in the development of quantum mechanics and the Manhattan Project, which led to the creation of the atomic bomb. Oppenheimer's work on the Uncertainty Principle, which states that it is impossible to simultaneously measure the exact position and momentum of a particle, was groundbreaking and helped shape the field of quantum mechanics. Additionally, his leadership of the Los Alamos National Laboratory during the Manhattan Project was crucial in the development and deployment of the atomic bomb. Oppenheimer's contributions to physics have had a profound impact on our understanding of the universe and the potential for human innovation.
```
### System info
- OS: Ubuntu 22.04
- Guidance version: 0.1.10
---
Facing the same problem with `models.Transformers` Mistral models.

Any progress on this? Having the same problem with mistral-7b-instruct-v0.2.Q8_0.gguf -.-

Same problem with 0.1.13 and 0.1.16.
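As a quick way to check whether a given model is affected, one hypothesis worth testing is that the model's tokenize/detokenize pair does not round-trip the prompt's bytes. The sketch below takes any tokenize/detokenize callables (for example `llama_cpp.Llama.tokenize` / `.detokenize` from llama-cpp-python, loaded on your GGUF file — names assumed, not verified against guidance's internals) and checks the round trip; the toy tokenizers stand in for real ones.

```python
def round_trips(tokenize, detokenize, prompt: bytes) -> bool:
    """True if detokenizing the token sequence reproduces the exact bytes."""
    return detokenize(tokenize(prompt)) == prompt

# Toy tokenizer that preserves bytes: split into 3-byte chunks, rejoin.
good_tok = lambda b: [b[i:i + 3] for i in range(0, len(b), 3)]
good_detok = lambda toks: b"".join(toks)
assert round_trips(good_tok, good_detok, b"Explain Oppenheimer")

# Toy detokenizer that injects a leading space, the way some
# sentencepiece-based GGUF vocabularies do -- the failure mode suspected here:
bad_detok = lambda toks: b" " + b"".join(toks)
assert not round_trips(good_tok, bad_detok, b"Explain Oppenheimer")
```

If the round trip fails for a model that also trips the assert, that would support the byte-accounting explanation; it is only a diagnostic, not a fix.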