openchat
Incomplete Output even with max_new_tokens
The output of my fine-tuned openchat model ends abruptly, and I would ideally like it to finish the paragraph/sentence/code it was in the middle of. I have set max_new_tokens = 300 and also asked in the prompt for a limit of 300 words.
The response is always long and ends abruptly. Is there any way to get a complete output within the desired number of output tokens?
from transformers import GenerationConfig

generation_config = GenerationConfig(
    do_sample=True,
    top_k=10,
    temperature=0.01,
    pad_token_id=tokenizer.eos_token_id,
    early_stopping=True,
    max_new_tokens=300,
    return_full_text=False,
)
max_new_tokens caps how many tokens the model may generate: once it has produced 300 tokens, generation stops, even mid-sentence. It does not make the model plan a complete answer within that budget. If you want responses that finish naturally within the limit, prompt the model to give brief answers, fine-tune it on short answers, or raise max_new_tokens so the model can reach its end-of-sequence token on its own.
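A common workaround for hard truncation is to post-process the generated text and cut it back to the last complete sentence. Here is a minimal sketch; the helper name trim_to_last_sentence is my own, not part of the transformers library:

```python
def trim_to_last_sentence(text: str) -> str:
    """Cut a possibly-truncated generation back to its last complete sentence."""
    # Find the position of the last sentence-ending punctuation mark.
    cut = max(text.rfind(p) for p in (".", "!", "?"))
    # If no terminator is found, return the text unchanged.
    return text[: cut + 1] if cut != -1 else text

# A generation stopped mid-sentence by max_new_tokens:
print(trim_to_last_sentence("First sentence. Second sentence. Third sent"))
# → First sentence. Second sentence.
```

This throws away the dangling fragment rather than completing it, but the result at least reads as a finished answer.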