
[web search] An error occurred with the web search "Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking."

Open cfregly opened this issue 2 years ago • 13 comments

[screenshot of the error]

cfregly avatar Jun 02 '23 22:06 cfregly

InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
    at Proxy.textGeneration (file:///home/ubuntu/chat/node_modules/@huggingface/inference/dist/index.mjs:460:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.generateFromDefaultEndpoint (/home/ubuntu/chat/src/lib/server/generateFromDefaultEndpoint.ts:22:28)
    at async POST (/home/ubuntu/chat/src/routes/conversation/[id]/summarize/+server.ts:30:26)
    at async Module.render_endpoint (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/endpoint.js:47:20)
    at async resolve (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:17)
    at async Object.handle (/home/ubuntu/chat/src/hooks.server.ts:66:20)
    at async Module.respond (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:259:20)
    at async file:///home/ubuntu/chat/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:506:22

cfregly avatar Jun 02 '23 22:06 cfregly

Hey! Thanks for filing a report. I'd like to look into it, but I'm going to need a few more details.

  • Did you set up your own SERPAPI_KEY env variable in your .env.local?
  • Are you using a custom MODELS env variable?
  • If so, what models are you using?
  • What DB are you using?
  • Does chat-ui work without the web search?

nsarrazin avatar Jun 03 '23 07:06 nsarrazin

  • Did you set up your own SERPAPI_KEY env variable in your .env.local? Yes, and I restarted.

Are you using a custom MODELS env variable?

Yes:

MODELS=`[
  {
    "endpoints": [
        {"url": "http://127.0.0.1:8080/generate_stream", "weight": 100}
    ],
    "name": "...",
    "userMessageToken": "<|prompter|>",
    "assistantMessageToken": "<|assistant|>",
    "messageEndToken": "</s>",
    "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024
    }
  }
...

If so, what models are you using? LLaMA, Open Assistant, Falcon

What DB are you using?

MONGODB_DB_NAME=demo
MONGODB_URL=mongodb://127.0.0.1:27017/
MONGODB_DIRECT_CONNECTION=false

Does chat-ui work without the web search?

Yes, but strangely, I also see the Invalid inference output error in the console while still getting a valid response in the UI. When I enable web search, the error shows up in the UI itself. Hmm.

InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
    at Proxy.textGeneration (file:///home/ubuntu/chat/node_modules/@huggingface/inference/dist/index.mjs:460:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.generateFromDefaultEndpoint (/home/ubuntu/chat/src/lib/server/generateFromDefaultEndpoint.ts:22:28)
    at async POST (/home/ubuntu/chat/src/routes/conversation/[id]/summarize/+server.ts:30:26)
    at async Module.render_endpoint (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/endpoint.js:47:20)
    at async resolve (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:17)
    at async Object.handle (/home/ubuntu/chat/src/hooks.server.ts:66:20)
    at async Module.respond (/home/ubuntu/chat/node_modules/@sveltejs/kit/src/runtime/server/respond.js:259:20)
    at async file:///home/ubuntu/chat/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:506:22
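The error message itself suggests a way to dig further: the inference client's untyped request method returns whatever the endpoint actually sent, without the Array<{generated_text: string}> shape check. A minimal sketch of my own (assuming the @huggingface/inference client this project uses; the URL is the endpoint from the MODELS config above):

// Sketch: repeat the failing call without type checking to inspect the
// raw response, as the error message recommends.
import { HfInference } from "@huggingface/inference";

const hf = new HfInference();
const raw = await hf.request({
  model: "http://127.0.0.1:8080/generate_stream", // endpoint URL from MODELS
  inputs: "Summarize: hello world",
  parameters: { max_new_tokens: 32 },
});
console.log(raw); // reveals the actual shape the endpoint returns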

cfregly avatar Jun 05 '23 00:06 cfregly

docker run --gpus 4 --shm-size 1g -p 8080:80 -v $PWD/data:/data ghcr.io/huggingface/text-generation-inference:latest --model-id TheBloke/OpenAssistant-SFT-7-Llama-30B-HF --num-shard 4 --quantize bitsandbytes
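To rule out the model server itself, it can also help to hit text-generation-inference's non-streaming /generate route directly. A quick check (my own sketch; URL and port assumed from the docker command above):

// Sanity check: a healthy TGI server answers /generate with JSON
// like {"generated_text": "..."}.
const res = await fetch("http://127.0.0.1:8080/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ inputs: "Hello", parameters: { max_new_tokens: 20 } }),
});
console.log(await res.json());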

cfregly avatar Jun 05 '23 00:06 cfregly

I think the error above is a separate issue I'm having with /summarize/+server.ts: my conversation titles are all "Untitled", probably because summarization isn't working for whatever reason. I'll create an issue for that separately.

Any hints on how I can debug the web search issue? Or is it getting stuck on the summarize issue above?

cfregly avatar Jun 05 '23 04:06 cfregly

The web search might be broken because it fails on any error from the inference endpoint, while the plain-answer path doesn't. That would explain why you still get answers with web search off, plus a console error... (I'm just guessing here, but it seems likely.)

So the issue underpinning all of this seems to be the Invalid inference output error you get when running your local models with text-generation-inference.

Thanks for the super detailed feedback, I'll have a deeper look at this.

nsarrazin avatar Jun 05 '23 07:06 nsarrazin

Any update on this or https://github.com/huggingface/chat-ui/issues/278?

cfregly avatar Jun 10 '23 19:06 cfregly

I have the same problem.

Did you set up your own SERPAPI_KEY env variable in your .env.local?

Yes. Also tried Serper.

Are you using a custom MODELS env variable?

Yes:
MODELS=`[
  {
    "name": "OpenAI GPT-3.5",
	"description": "OpenAI's second-best performing model (ChatGPT)",
    "websiteUrl": "https://openai.com",
    "endpoints": [{"url": "http://127.0.0.1:8000/generate_stream"}],
    "userMessageToken": "User: ",
    "assistantMessageToken": "Assistant: ",
    "messageEndToken": "\n",
    "preprompt": "You are a helpful assistant named secondChat.",
    "parameters": {
        "temperature": 0.9,
        "max_new_tokens": 500,
        "truncate": 500
    }
  },
  {
    "name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
	"displayName": "OpenAssistant",
    "description": "A good alternative to ChatGPT",
    "websiteUrl": "https://open-assistant.io",
    "datasetName": "OpenAssistant/oasst1",
    "userMessageToken": "<|prompter|>",
    "assistantMessageToken": "<|assistant|>",
    "messageEndToken": "</s>",
    "preprompt": "Below are a series of dialogues between a human and an AI assistant. The assistant is named \"secondChat\" (spelled exactly that way) and was developed by secondtruth. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
    "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      }, {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python, give explanations for each step."
      }, {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
    ],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 500,
      "max_new_tokens": 300
    }
  }
]`

If so, what models are you using?

Tried both; same result.

What DB are you using?

MONGODB_URL=mongodb://localhost:27017

Does chat-ui work without the web search?

Text generation always works, in both search mode and normal mode. In normal mode there is no error message in the web UI, but an error always pops up in the console.

InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
    at Proxy.textGeneration (file:///D:/Ablage/Projekte/Experimente/AI/huggingchat/node_modules/@huggingface/inference/dist/index.mjs:460:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Module.generateFromDefaultEndpoint (/src/lib/server/generateFromDefaultEndpoint.ts:22:28)
    at async POST (/src/routes/conversation/[id]/summarize/+server.ts:30:26)
    at async Module.render_endpoint (/node_modules/@sveltejs/kit/src/runtime/server/endpoint.js:47:20)
    at async resolve (/node_modules/@sveltejs/kit/src/runtime/server/respond.js:388:17)
    at async Object.handle (/src/hooks.server.ts:66:20)
    at async Module.respond (/node_modules/@sveltejs/kit/src/runtime/server/respond.js:259:20)
    at async file:///D:/Ablage/Projekte/Experimente/AI/huggingchat/node_modules/@sveltejs/kit/src/exports/vite/dev/index.js:506:22

secondtruth avatar Jun 25 '23 18:06 secondtruth

Temp solution:

Say you have text-generation-inference running on http://1.1.1.1:8080.

In src/lib/server/generateFromDefaultEndpoint.ts, change

{
    model: endpoint.url,
    inputs: prompt,
    parameters: newParameters,
}

to

{
    model: `http://1.1.1.1:8080`,
    inputs: prompt,
    parameters: newParameters,
}
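A slightly less hardcoded variant of the same workaround (a sketch of my own, untested; endpoint.url is the value coming from the MODELS config) would strip the streaming suffix instead of pinning the host:

// Hypothetical variant: derive the base URL from the configured endpoint
// rather than hardcoding it; the typed textGeneration() call expects the
// non-streaming route, so /generate_stream is stripped.
const model = endpoint.url.replace(/\/generate_stream\/?$/, "");

{
    model,
    inputs: prompt,
    parameters: newParameters,
}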

seongminp avatar Jul 05 '23 13:07 seongminp

It seems to fix the problem if you omit /generate_stream from the URL in the model definition in the .env file. This means the configuration should look e.g. like this:

MODELS=`[
  {
    "endpoints": [
        {"url": "http://127.0.0.1:8080"}
    ],
    "name": "...",
    ...
  }
  ...
]`
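A plausible explanation for why this works (my assumption, not confirmed in this thread): the inference client POSTs to the configured URL and parses the body as JSON, but /generate_stream answers with a server-sent-events stream, which can never match Array<{generated_text: string}>. The difference is easy to see by comparing content types (port taken from the configs above):

// The /generate route answers with JSON; /generate_stream answers with
// an event stream the typed client cannot parse.
const body = JSON.stringify({ inputs: "Hello", parameters: { max_new_tokens: 16 } });
for (const path of ["/generate", "/generate_stream"]) {
  const res = await fetch(`http://127.0.0.1:8080${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
  });
  // expected: /generate -> application/json, /generate_stream -> text/event-stream
  console.log(path, "->", res.headers.get("content-type"));
}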

CookieKlecks avatar Jul 06 '23 15:07 CookieKlecks

Experiencing the same issue here when trying to connect to a custom inference model running in a separate Docker container on port 8000.

GeorgeStrakhov avatar Jul 29 '23 15:07 GeorgeStrakhov

The fix above (omitting /generate_stream from the endpoint URL in the .env file) also worked for me: it fixed the InferenceOutputError, and the web search is working.

schauppi avatar Jul 30 '23 15:07 schauppi