freq
I wonder whether you need to add an additional hyperparameter "timeout" to the following call during evaluation: completion = client.chat.completions.create(model=model, messages=[{"role": "user", "content": prompt}], temperature=temperature, max_tokens=max_new_tokens, timeout=40000 # or...
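To make the suggestion concrete, here is a minimal sketch of how a per-request timeout could be passed through. It assumes `client` is an OpenAI-compatible client object (for example from the `openai` Python package, where, as far as I know, `timeout` is given in seconds). The function name `query_llm`, the retry loop, and the defaults `timeout_s`, `tries`, and `wait_s` are hypothetical and are not taken from pred.py.

```python
import time


def query_llm(client, model, prompt, temperature=0.5, max_new_tokens=128,
              timeout_s=600.0, tries=3, wait_s=1.0):
    """Send one chat request with an explicit timeout and simple retries.

    `client` is assumed to expose `chat.completions.create(...)` like the
    OpenAI Python client. All default values here are illustrative only.
    """
    last_error = None
    for attempt in range(tries):
        try:
            completion = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                temperature=temperature,
                max_tokens=max_new_tokens,
                timeout=timeout_s,  # per-request timeout (assumed to be seconds)
            )
            return completion.choices[0].message.content
        except Exception as exc:  # broad catch is fine for a sketch
            last_error = exc
            if attempt < tries - 1:
                time.sleep(wait_s)  # short pause before the next attempt
    raise RuntimeError(f"request failed after {tries} attempts") from last_error
```

Long-context prompts combined with a reasoning model can take a long time to answer, so a timeout that is too short may show up as failed or empty predictions rather than as an obvious error.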
Thanks for your reply!
Have you evaluated QwQ-32B on LongBench v1? If so, are there any adjustments to the hyperparameters in pred.py?