Second call to a generator created with `generate.json` fails with an exception
Describe the issue as clearly as possible:
See the code below. The first call to the generator succeeds, but the second call to the same generator raises `IndexError: list index out of range`.
Steps/code to reproduce the bug:
```python
import json

from outlines import models, generate, samplers

# RESULTS_JSON_SCHEMA (a JSON schema) and markdown_doc_az (a markdown document)
# are defined elsewhere in the original script.

model = models.mlxlm("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
sampler = samplers.multinomial(top_p=0.1)
generator = generate.json(model, RESULTS_JSON_SCHEMA, sampler)

json_answer = generator(f"{markdown_doc_az} \n\n List all the associations mentioned in the markdown document above.", max_tokens=5000)
print("\n\n\n", json.dumps(json_answer, indent=4))

print("\n\n++++++++ SECOND RUN! +++++++++\n\n")

# Does NOT work without regenerating the generator, which is a bug; the generator should be reusable.
# generator = generate.json(model, RESULTS_JSON_SCHEMA, sampler)
json_answer = generator(f"{markdown_doc_az} \n\n List all the addresses mentioned in the markdown document above ", max_tokens=1000)
```
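The commented-out line above is the only workaround found so far: rebuilding the generator before every call. A minimal sketch of that workaround (it only sidesteps the problem, it is not a fix):

```python
def fresh_json_generator():
    # Hypothetical helper: rebuilding the generator before each call avoids the
    # crash, at the cost of constructing a new logits processor every time.
    return generate.json(model, RESULTS_JSON_SCHEMA, sampler)

json_answer = fresh_json_generator()(
    f"{markdown_doc_az} \n\n List all the addresses mentioned in the markdown document above ",
    max_tokens=1000,
)
```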
Expected result:
No exception; the generator should be reusable across calls.
Error message:
```
File lib/python3.12/site-packages/outlines/models/mlxlm.py:41, in MLXLM.generate(self, prompts, generation_parameters, logits_processor, sampling_parameters)
     31 def generate(
     32     self,
     33     prompts: Union[str, List[str]],
    (...)
     36     sampling_parameters: "SamplingParameters",
     37 ) -> str:
     38     streamer = self.stream(
     39         prompts, generation_parameters, logits_processor, sampling_parameters
     40     )
---> 41     return "".join(list(streamer))

File lib/python3.12/site-packages/outlines/models/mlxlm.py:109, in MLXLM.stream(self, prompts, generation_parameters, logits_processor, sampling_parameters)
    105 # Adapted from
    106 # https://github.com/ml-explore/mlx-examples/blob/4872727/llms/mlx_lm/utils.py#L267
    107 prompt_tokens = mx.array(self.mlx_tokenizer.encode(prompts))
--> 109 for (token, prob), n in zip(
    110     self.generate_step(prompt_tokens, **generate_kwargs),
    111     range(max_tokens),
    112 ):
    113     if token == self.tokenizer.eos_token_id:
    114         break

File lib/python3.12/site-packages/outlines/models/mlxlm.py:181, in MLXLM.generate_step(self, prompt, temp, top_p, sampler, logits_processor)
    178 if logits_processor is not None:
    179     # convert to logits_processor 1d expectation, apply, then convert back
    180     logits_1d = logits.reshape(-1)
--> 181     logits_1d = logits_processor(generated_ids, logits_1d)
    182     logits = logits_1d.reshape(1, -1)
    184 new_token_single, prob = sample(logits)

File lib/python3.12/site-packages/outlines/processors/base_logits_processor.py:68, in BaseLogitsProcessor.__call__(self, input_ids, logits)
     65 import mlx.core as mx
     67 torch_logits = torch.from_dlpack(logits)
---> 68 processed_torch_logits = self.process_logits(input_ids, torch_logits)
     70 # numpy doesn't support bfloat16, mlx doesn't support direct conversion from torch
     71 logits_float32_numpy = processed_torch_logits.float().numpy()

File lib/python3.12/site-packages/outlines/processors/structured.py:90, in FSMLogitsProcessor.process_logits(self, input_ids, logits)
     88     self._is_first_token = False
     89 else:
---> 90     last_token = input_ids[-1]
     91     self._fsm_state = self.fsm.get_next_state(self._fsm_state, last_token)
     93 allowed_tokens = self.fsm.get_next_instruction(self._fsm_state).tokens

IndexError: list index out of range
```
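For context, the traceback points at per-generation state inside `FSMLogitsProcessor` (the `_is_first_token` flag and `_fsm_state`) that appears to survive the first generation, so the second generation starts with empty `input_ids` while the flag is already `False`. A standalone sketch of that suspected failure mode (plain Python stand-in, not the Outlines API; class and method names are illustrative):

```python
class StatefulProcessor:
    """Hypothetical stand-in for a logits processor that keeps per-generation state."""

    def __init__(self):
        self._is_first_token = True

    def step(self, input_ids):
        if self._is_first_token:
            # Flag is flipped once and never reset for a new generation.
            self._is_first_token = False
        else:
            # Raises IndexError when a new generation starts with empty input_ids.
            last_token = input_ids[-1]
            print("advanced past token", last_token)

p = StatefulProcessor()
p.step([])        # first generation, first step: fine
p.step([11, 42])  # first generation, later step: fine
p.step([])        # second generation, first step: IndexError: list index out of range
```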
Outlines/Python version information:
```
0.0.46
Python 3.12.4
```
Context for the issue:
No response