LlamaCpp - Logits Processor never ends
Describe the issue as clearly as possible:
Generating text with LlamaCpp through the JSONLogitsProcessor does not terminate on its own and never produces the closing '}'. See the example:
Steps/code to reproduce the bug:
# %%
from enum import Enum
from llama_cpp import Llama, LogitsProcessorList
from pydantic import BaseModel, constr
from outlines.integrations.llamacpp import JSONLogitsProcessor
# %%
class Weapon(str, Enum):
    sword = "sword"
    axe = "axe"
    mace = "mace"
    spear = "spear"
    bow = "bow"
    crossbow = "crossbow"


class Armor(str, Enum):
    leather = "leather"
    chainmail = "chainmail"
    plate = "plate"


class Character(BaseModel):
    name: constr(min_length=3, max_length=10)
    age: int
    armor: Armor
    weapon: Weapon
    strength: int
# %%
llama = Llama("TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf", n_gpu_layers=999)
# %%
prompt = "Instruct: You are a leading role play gamer. You have seen thousands of different characters and their attributes.\nPlease return a JSON object with common attributes of an RPG character. Give me a character description\nOutput:"
logits_processor = JSONLogitsProcessor(Character, llama)
json_str = llama.create_completion(
    prompt,
    top_k=40,
    top_p=0.95,
    temperature=0.7,
    max_tokens=100,
    logits_processor=LogitsProcessorList([logits_processor]),
)["choices"][0]["text"]
print(json_str)
print(Character.model_validate_json(json_str))
Expected result:
{"name":"RPG_CHAR","age":25,"armor": "leather","weapon":"sword", "strength":10}
but results are most often:
{"name":"RPG_CHAR","age":25,"armor": "leather","weapon":"sword", "strength":1000000000000000000000000000000000000000000000000000000000000000000
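One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the `strength: int` field is constrained by an unbounded integer pattern, every extra digit keeps the output valid, so nothing forces the model to emit `}` before `max_tokens` runs out. The sketch below uses only the standard `re` module. The pattern is a guess at what a JSON integer pattern might look like and is not taken from the Outlines source, and `is_valid_integer` is a hypothetical helper.

```python
import re

# Assumed pattern for a JSON integer (hypothetical, not copied from Outlines).
INTEGER = re.compile(r"-?(0|[1-9][0-9]*)")


def is_valid_integer(text: str) -> bool:
    """Return True if the whole string matches the assumed integer pattern."""
    return INTEGER.fullmatch(text) is not None


# A reasonable value and a runaway value like the one in the output above.
short = "10"
runaway = "1" + "0" * 60

# Both are accepted, and appending another digit to the runaway value
# is accepted too, so the pattern alone never requires the number to end.
print(is_valid_integer(short))
print(is_valid_integer(runaway))
print(is_valid_integer(runaway + "0"))
```

If this is what happens, bounding the field in the schema might help, but that has not been tested here.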
Error message:
No response
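No exception is raised before the final validation step, but the completion dict itself can show whether generation was cut off. As far as I know, llama-cpp-python reports a `finish_reason` for each choice, with `"length"` meaning `max_tokens` was reached and `"stop"` meaning the model ended on its own. A minimal sketch of that check follows. `hit_token_limit` is a hypothetical helper, and the two dicts are hand-written stand-ins for a real response, not captured output.

```python
from typing import Any, Mapping


def hit_token_limit(response: Mapping[str, Any]) -> bool:
    """Return True if the first choice stopped because max_tokens was reached."""
    return response["choices"][0].get("finish_reason") == "length"


# Hand-written stand-ins shaped like the dict returned by create_completion.
truncated = {"choices": [{"text": '{"strength":1000', "finish_reason": "length"}]}
finished = {"choices": [{"text": '{"strength":10}', "finish_reason": "stop"}]}

for name, response in [("truncated", truncated), ("finished", finished)]:
    print(name, hit_token_limit(response))
```

In the repro above this would mean keeping the full return value of `llama.create_completion(...)` instead of indexing straight into `["choices"][0]["text"]`.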
Outlines/Python version information:
```
0.0.36
Python 3.11.8 (main, Feb 26 2024, 15:36:12) [Clang 14.0.6 ]
```
Context for the issue:
This makes Outlines unusable with the LlamaCpp backend.