Update of MLX-LM generate_step to support repetition_penalty
To fix https://github.com/outlines-dev/outlines/issues/1131, here is an updated mlxlm.py that supports the repetition_penalty and repetition_context_size parameters in the generate_step function. A repetition penalty prevents the model from falling into an infinite loop in which it generates the same group of tokens endlessly.
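To illustrate the mechanism, here is a minimal sketch of how a repetition penalty is typically applied to the logits before sampling, with only the last repetition_context_size tokens considered. This is a plain-Python illustration of the general technique, not the actual mlxlm.py code; the function name apply_repetition_penalty is hypothetical.

```python
def apply_repetition_penalty(logits, generated_tokens, penalty, context_size):
    """Penalize tokens that appear in the recent generation context.

    logits: list of float scores, one per vocabulary token.
    generated_tokens: list of token ids generated so far.
    penalty: > 1.0 discourages repetition; 1.0 is a no-op.
    context_size: only the last `context_size` tokens are penalized.
    """
    logits = list(logits)  # copy so the caller's list is untouched
    for token_id in set(generated_tokens[-context_size:]):
        score = logits[token_id]
        # Positive scores are divided, negative scores are multiplied,
        # so the penalty always lowers the token's score.
        logits[token_id] = score / penalty if score > 0 else score * penalty
    return logits
```

With penalty=1.1 and context_size=20, any token emitted in the last 20 steps becomes slightly less likely to be sampled again, which is usually enough to break short repetition loops.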
The repetition_penalty and repetition_context_size parameters can then be passed directly as arguments to the generator. Here is an example:
from outlines import generate, samplers

sampler = samplers.multinomial(top_p=0.1)
generator = generate.json(model, JSON_SCHEMA, sampler)
json_answer = generator(my_prompt, max_tokens=1000, repetition_penalty=1.1, repetition_context_size=20)