Update of MLX-LM generate_step to support repetition_penalty
To fix https://github.com/outlines-dev/outlines/issues/1131, here is an updated mlxlm.py that supports the repetition_penalty and repetition_context_size parameters in the generate_step function. A repetition penalty prevents the model from falling into an infinite loop in which it generates the same group of tokens endlessly.
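To illustrate the mechanism, here is a minimal sketch of how a repetition penalty is typically applied to the logits before sampling, with only the last repetition_context_size tokens considered. This is a plain-Python illustration of the general technique, not the actual mlxlm.py code; the function name apply_repetition_penalty is hypothetical.

```python
def apply_repetition_penalty(logits, generated_tokens, penalty, context_size):
    """Penalize tokens that appear in the recent generation context.

    logits: list of float scores, one per vocabulary token.
    generated_tokens: list of token ids generated so far.
    penalty: > 1.0 discourages repetition; 1.0 is a no-op.
    context_size: only the last `context_size` tokens are penalized.
    """
    logits = list(logits)  # copy so the caller's list is untouched
    for token_id in set(generated_tokens[-context_size:]):
        score = logits[token_id]
        # Positive scores are divided, negative scores are multiplied,
        # so the penalty always lowers the token's score.
        logits[token_id] = score / penalty if score > 0 else score * penalty
    return logits
```

With penalty=1.1 and context_size=20, any token emitted in the last 20 steps becomes slightly less likely to be sampled again, which is usually enough to break short repetition loops.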
The repetition_penalty and repetition_context_size parameters can then be passed directly as arguments to the generator. Here is an example:
from outlines import generate, samplers

sampler = samplers.multinomial(top_p=0.1)
generator = generate.json(model, JSON_SCHEMA, sampler)
json_answer = generator(my_prompt, max_tokens=1000, repetition_penalty=1.1, repetition_context_size=20)