
JSONError: incomplete JSON str ends prematurely [Model: Qwen-72B-Chat served via FastChat]

Open Haxeebraja opened this issue 1 year ago • 2 comments

I am getting the following error on Qwen-72B-Chat served via FastChat. It occurs when the response is somewhat long. However, the context length of Qwen-72B is 32k, which should be enough. Is there a parameter in FastChat or TaskWeaver to set this, or is the error below due to some other reason?

Short responses work fine, for example printing the numbers 1 to 100. However, printing the numbers 1 to 500 fails with this error:

```
2024-01-25 11:41:44 - Use back up engine: False
2024-01-25 11:41:46 - LLM output: {"
2024-01-25 11:41:46 - Failed to parse LLM output stream due to JSONError: incomplete JSON str ends prematurely
2024-01-25 11:41:46 - Traceback (most recent call last):
  File "/workspace/mounted/TaskWeaver/playground/UI/../../taskweaver/session/session.py", line 134, in _send_text_message
    post = _send_message(post.send_to, post)
  File "/workspace/mounted/TaskWeaver/playground/UI/../../taskweaver/session/session.py", line 108, in _send_message
    reply_post = self.planner.reply(
  File "/workspace/mounted/TaskWeaver/playground/UI/../../taskweaver/planner/planner.py", line 263, in reply
    self.planner_post_translator.raw_text_to_post(
  File "/workspace/mounted/TaskWeaver/playground/UI/../../taskweaver/role/translator.py", line 118, in raw_text_to_post
    validation_func(post_proxy.post)
  File "/workspace/mounted/TaskWeaver/playground/UI/../../taskweaver/planner/planner.py", line 229, in check_post_validity
    post.attachment_list[0].type == AttachmentType.init_plan
IndexError: list index out of range
```
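For context, the failure mode itself is easy to reproduce in isolation: when generation is cut off mid-response, the model's JSON is left unterminated and any strict parser rejects it. The sketch below uses the standard-library `json` module rather than TaskWeaver's actual stream parser, and the truncated string is a made-up illustration:

```python
import json

# A response that was cut off mid-string, as happens when the
# generation token limit is reached before the JSON is complete.
truncated = '{"response": [{"type": "init_plan", "content": "1. print numbers'

try:
    json.loads(truncated)
except json.JSONDecodeError as e:
    # Strict parsers cannot recover a partial object; TaskWeaver's
    # planner then sees an empty attachment list and raises IndexError.
    print("incomplete JSON:", e.msg)
```

This is consistent with the log above: the parser received only `{"` and gave up, so the real question is why the generation stopped so early.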

Haxeebraja avatar Jan 25 '24 12:01 Haxeebraja

It looks like the output of the LLM is `{"` as displayed in the logs. If this is a persistent issue, the problem may lie in parsing the output of the LLM from the serving platform.

liqul avatar Jan 26 '24 02:01 liqul

> It looks like the output of the LLM is `{"` as displayed in the logs. If this is a persistent issue, the problem may lie in parsing the output of the LLM from the serving platform.

That might be possible, but it occurs only with long outputs. Is there a way to set max_new_tokens in TaskWeaver?
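Whether TaskWeaver exposes a generation-length knob directly is worth checking in its docs, but FastChat's OpenAI-compatible server accepts the standard `max_tokens` field on chat-completion requests, which caps output length independently of the 32k context window. A sketch of such a request payload (the model name here is the one registered with FastChat; the limit value is an example):

```python
import json

# Chat-completion request body for FastChat's OpenAI-compatible API.
# `max_tokens` bounds the *generated* tokens; if it is too small, long
# answers get truncated mid-JSON even though the context window is 32k.
payload = {
    "model": "Qwen-72B-Chat",
    "messages": [
        {"role": "user", "content": "Print the numbers 1 to 500."},
    ],
    "max_tokens": 4096,  # raise this if long responses are cut off
}

body = json.dumps(payload)
```

This body would be POSTed to the server's `/v1/chat/completions` endpoint; the point is that output truncation is governed by the generation cap (and available GPU memory), not by the context length.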

Haxeebraja avatar Jan 27 '24 18:01 Haxeebraja

The issue was resolved by increasing the GPU memory allocated to the model in FastChat.

Haxeebraja avatar Jan 30 '24 06:01 Haxeebraja