
Summarization issue

Open cfregly opened this issue 1 year ago • 4 comments

Any tips on debugging this? Which model is used to summarize? And I assume this is called by the UI to summarize the conversation for the left nav?

Currently, my conversation titles are all Untitled 1, Untitled 2, etc.

InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
    at Proxy.textGeneration (file:///home/ubuntu/chat/node_modules/@huggingface/inference/dist/index.mjs:460:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.generateFromDefaultEndpoint (/home/ubuntu/chat/src/lib/server/generateFromDefaultEndpoint.ts:22:28)
    at async POST (/home/ubuntu/chat/src/routes/conversation/[id]/summarize/+server.ts:30:26)
    at async Module.render_endpoint (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/endpoint.js:47:20)
    at async resolve (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:17)
    at async Object.handle (/home/ubuntu/chat/src/hooks.server.ts:66:20)
    at async Module.respond (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:259:20)
    at async file:///home/ubuntu/chat/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:506:22

cfregly avatar Jun 05 '23 04:06 cfregly

Here's a similar error:

Error: Could not parse generated text
    at parseGeneratedText (/home/ubuntu/chat/src/routes/conversation/[id]/+server.ts:186:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async saveMessage (/home/ubuntu/chat/src/routes/conversation/[id]/+server.ts:95:26)
InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
    at Proxy.textGeneration (file:///home/ubuntu/chat/node_modules/@huggingface/inference/dist/index.mjs:460:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.generateFromDefaultEndpoint (/home/ubuntu/chat/src/lib/server/generateFromDefaultEndpoint.ts:22:28)
    at async POST (/home/ubuntu/chat/src/routes/conversation/[id]/summarize/+server.ts:30:26)
    at async Module.render_endpoint (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/endpoint.js:47:20)
    at async resolve (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:17)
    at async Object.handle (/home/ubuntu/chat/src/hooks.server.ts:66:20)
    at async Module.respond (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:259:20)
    at async file:///home/ubuntu/chat/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:506:22

cfregly avatar Jun 05 '23 04:06 cfregly

Running into the same issue.

psinger avatar Jun 12 '23 09:06 psinger

Same for me.

JulianGerhard21 avatar Jun 12 '23 13:06 JulianGerhard21

I believe it has to do with the fact that it calls the external HF library API code instead of the local one.

psinger avatar Jun 12 '23 13:06 psinger

@psinger - to me it seems that the summarize function wants to call the default URL (generate_stream) but is not prepared for its return type (a stream), and hence throws an exception. I have modified generateFromDefaultEndpoint so that it uses the generate endpoint, which produces:

{
  generated_text: 'Du kannst versuchen, regelmäßig Sport zu treiben und gesunde Lebensmittel zu essen.'
}

Now I am trying to fix the fact that parseGeneratedText has a problem decomposing that JSON. (The generated text above is German for "You can try to exercise regularly and eat healthy food.")

JulianGerhard21 avatar Jun 13 '23 07:06 JulianGerhard21
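For reference, a minimal sketch of how both response shapes could be normalized — this is a hypothetical helper, not the project's actual parseGeneratedText. TGI's generate endpoint returns a single { generated_text } object, while the @huggingface/inference type check expects a one-element array:

```typescript
// Hypothetical helper illustrating the shape mismatch: TGI's /generate
// returns a single { generated_text } object, while the HF Inference API
// (and the @huggingface/inference type check) expects a one-element array.
type TgiResponse =
  | { generated_text: string }
  | Array<{ generated_text: string }>;

function extractGeneratedText(resp: TgiResponse): string {
  // Normalize: take the first element if the response is an array.
  const item = Array.isArray(resp) ? resp[0] : resp;
  if (!item || typeof item.generated_text !== "string") {
    throw new Error("Could not parse generated text");
  }
  return item.generated_text;
}
```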

Yes, the issue is that summarize is calling the external HF node library code via textGeneration, which is not what the conversational code does. The same goes for the web search code.

I fixed it by hard-coding a manual API request - probably similar to what you did.

psinger avatar Jun 13 '23 08:06 psinger

> Yes, the issue is that summarize is calling external HF node library code via textGeneration, not the same as the conversational code does. [...] I fixed it by hard-coding manual api request - probably similar to what you did.

@psinger - Would you be willing to share your fix so that I can take a look at it? No problem if not.

JulianGerhard21 avatar Jun 13 '23 09:06 JulianGerhard21

I also ran into the same problem. Is there any solution?

chenglong0313 avatar Jun 13 '23 09:06 chenglong0313

@JulianGerhard21 I am doing it along these lines:

try {
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: endpoint.authorization,
    },
    body: JSON.stringify(data),
  });
  const resp = await response.json();
  // Strip the start-of-text prefix and the separator suffix from the raw output.
  const generated_text = trimSuffix(trimPrefix(resp.generated_text, "<|startoftext|>"), PUBLIC_SEP_TOKEN);
  return generated_text;
} catch (error) {
  console.error(error);
  return "";
}

Instead of:

let { generated_text } = await textGeneration(
  {
    model: url,
    inputs: prompt,
    parameters: newParameters,
  },
  {
    fetch: (url, options) =>
      fetch(url, {
        ...options,
        headers: { ...options?.headers, Authorization: endpoint.authorization },
      }),
  }
);

You also need to call the generate endpoint here instead of the streaming one.

The issue is that, by default, the code calls the external HF JS library function textGeneration, which is not suited for local endpoints.

psinger avatar Jun 15 '23 15:06 psinger
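For what it's worth, the `data` payload in the manual fetch above would presumably look something like this (an assumed shape, with field names following text-generation-inference's API; `stream` is disabled so the server returns a single JSON object rather than server-sent events):

```typescript
// Assumed request-body shape for a TGI endpoint (field names per
// text-generation-inference's API). stream: false selects the
// non-streaming /generate behaviour on the compat ("/") route.
const data = {
  inputs: "Summarize this conversation in five words or less.",
  parameters: { max_new_tokens: 30, temperature: 0.9 },
  stream: false,
};

const body = JSON.stringify(data);
```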

Any update on this?

cfregly avatar Jun 16 '23 21:06 cfregly

After mucking around a bit I found a temporary fix. The error occurs because the summary request calls the streaming endpoint instead of the non-streaming one, so the textGeneration call throws an error.

Say you have text-generation-inference running on http://1.1.1.1:8080.

In src/lib/server/generateFromDefaultEndpoint.ts, change

{
    model: endpoint.url,
    inputs: prompt,
    parameters: newParameters,
},

to

{
    model: `http://1.1.1.1:8080`,
    inputs: prompt,
    parameters: newParameters,
},

Note the url is NOT http://1.1.1.1:8080/generate or http://1.1.1.1:8080/generate_stream!

seongminp avatar Jul 05 '23 05:07 seongminp

> After mucking around a bit I found a temporary fix. The error occurs because the summary request calls the streaming endpoint instead of the async endpoint. [...] Note the url is NOT http://1.1.1.1:8080/generate or http://1.1.1.1:8080/generate_stream!

I've tried this and it worked. But I am a bit confused as to why the URL works if there is no /generate ~~or /generate_stream~~.

maziyarpanahi avatar Jul 16 '23 09:07 maziyarpanahi

I've opened a PR which fixes this for a TGI endpoint: https://github.com/huggingface/chat-ui/pull/355

AndreasMadsen avatar Jul 19 '23 19:07 AndreasMadsen

> After mucking around a bit I found a temporary fix. The error occurs because the summary request calls the streaming endpoint instead of the async endpoint. [...]

> I've tried this and it worked. But I am a bit confused as why the URL works if there is no /generate ~~or /generate_stream~~.

If the local API is implemented as set out in the OpenAPI specs, / should point to generate_compat, which will call generate_stream or generate depending on whether the request contains stream=true or stream=false.

piratos avatar Jul 23 '23 10:07 piratos
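To illustrate the dispatch behaviour described above, here is a sketch (assuming a TGI server at the placeholder address used earlier in this thread): the same root route serves both modes, selected by the `stream` field.

```shell
# Hypothetical calls against a TGI server (placeholder address from above).
# The compat route "/" dispatches on the "stream" field of the request body.

# Non-streaming: returns a single JSON object {"generated_text": "..."}
curl http://1.1.1.1:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 16}, "stream": false}'

# Streaming: returns server-sent events, one token per "data:" line
curl http://1.1.1.1:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 16}, "stream": true}'
```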

Hello, can you tell me if this issue is still happening with the latest main? I think I might have improved things a bit, but I'm not sure; if someone could check with a remote endpoint, that would be great.

nsarrazin avatar Aug 18 '23 07:08 nsarrazin

The bug still exists. However, I now believe it is just a documentation issue. #408 fixes the docs.

AndreasMadsen avatar Aug 18 '23 15:08 AndreasMadsen