Open-Assistant
Truncated assistant output
Hello, I am running some tests on the model OpenAssistant/stablelm-7b-sft-v7-epoch-3
via the Hugging Face APIs, and I am getting truncated assistant responses.
My use case is extractive summarization, and my input messages are a little over 1000 tokens.
I wonder if I am missing an important parameter:
```python
# `model`, `tokenizer`, and `inputs` are created earlier (not shown)
outputs = model.generate(
    **inputs,
    max_new_tokens=3000,
    do_sample=True,
    temperature=0.01,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
```
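For context, the StableLM-Alpha 7B base model is reported to have a 4096-token context window, so a ~1000-token prompt plus `max_new_tokens=3000` only just fits. A minimal sketch of checking the real generation budget (the 4096 figure and the `generation_budget` helper are my assumptions; the actual limit is in `model.config.max_position_embeddings`):

```python
# Assumed context limit for StableLM-Alpha 7B; verify against
# model.config.max_position_embeddings for your checkpoint.
CONTEXT_WINDOW = 4096

def generation_budget(n_input_tokens: int, requested_new_tokens: int,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many new tokens can actually fit after the prompt."""
    remaining = max(context_window - n_input_tokens, 0)
    return min(requested_new_tokens, remaining)

# With ~1000 input tokens, requesting 3000 new tokens fits:
print(generation_budget(1000, 3000))  # 3000
# A longer prompt silently shrinks the budget:
print(generation_budget(2500, 3000))  # 1596
```

If the budget is not the problem, truncation can also come from the model emitting its end-of-text token early; `min_new_tokens` in `generate()` can force a longer output if that is the cause.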
Thanks