
cause: HeadersTimeoutError: Headers Timeout Error

smalik2043 opened this issue 11 months ago · 10 comments

I keep getting cause: HeadersTimeoutError: Headers Timeout Error when I request the homellm model:

cause: HeadersTimeoutError: Headers Timeout Error
       at Timeout.onParserTimeout [as callback] (node:internal/deps/undici/undici:8228:32)
       at Timeout.onTimeout [as _onTimeout] (node:internal/deps/undici/undici:6310:17)
       at listOnTimeout (node:internal/timers:573:17)
       at processTimers (node:internal/timers:514:7) {
     code: 'UND_ERR_HEADERS_TIMEOUT'
   }

Sometimes I get this instead:

 Error: Expected a completed response.
     at Ollama.processStreamableRequest (/usr/src/app/node_modules/ollama/dist/shared/ollama.a247cdd6.cjs:211:15)
     at processTicksAndRejections (node:internal/process/task_queues:95:5)
     at OllamaService.responseFromOllama (/usr/src/app/src/ollama/ollama.service.ts:21:24)
     at AppGateway.pubSubMessageAI (/usr/src/app/src/gateway/app.gateway.ts:98:18)
     at AppGateway.<anonymous> (/usr/src/app/node_modules/@nestjs/websockets/context/ws-proxy.js:11:32)
     at WebSocketsController.pickResult (/usr/src/app/node_modules/@nestjs/websockets/web-sockets-controller.js:91:24)

The relevant code:

import { Ollama } from 'ollama';

async function responseFromOllama(data: any) {
  try {
    // customerId and systemPromptName are destructured/derived but unused below
    const { customerId, message } = data;
    const systemPromptName = data.systemPromptName || 'Al';

    const ollama = new Ollama({ host: 'http://host.docker.internal:11434' });

    const response = await ollama.generate({
      model: 'homellm:latest',
      prompt: message,
      format: 'json',
      stream: false,
      system: `You are 'Al', a helpful AI Assistant that controls the devices in a house. Complete the following task as instructed with the information provided only.
      Services: light.turn_off(), light.turn_on(brightness,rgb_color), fan.turn_on(), fan.turn_off()
      Devices:
      light.office 'Office Light' = on;80%
      fan.office 'Office fan' = off
      light.kitchen 'Kitchen Light' = on;80%;red
      light.bedroom 'Bedroom Light' = off`,
    });
    return response;
  } catch (e) {
    console.log(e);
  }
}

I was getting responses before, but now I always get the Headers Timeout error or "Expected a completed response."

smalik2043 · Mar 27 '24

Facing a similar issue

muditjaju · Apr 06 '24

Similar issue here. I used it for a month without encountering this until recently. Any known solutions?

TypeError: fetch failed
      at Object.fetch (node:internal/deps/undici/undici:14062:11) {
    cause: HeadersTimeoutError: Headers Timeout Error
        at Timeout.onParserTimeout [as callback] (...\node_modules\undici\lib\client.js:1059:28)
        at Timeout.onTimeout [as _onTimeout] (...\node_modules\undici\lib\timers.js:20:13)
        at listOnTimeout (node:internal/timers:564:17)
        at process.processTimers (node:internal/timers:507:7) {
      code: 'UND_ERR_HEADERS_TIMEOUT'
    }
}

dustinnnnnn · Apr 07 '24

This happens to me when format: 'json' is set, and only with some models.

knoopx · Apr 18 '24

Happens to me as well! I am using llama3, and it only happens when I send a long message.

This happens only when I use the npm package. I have llama3 installed on my machine, and when I run the same message through ollama directly, it works.

How can this be solved?

bralca · May 03 '24

Googling the problem got me here. Same issue using llama3 on an M2 Ultra Mac Studio.

isbkch · May 06 '24

Same with llama3.

OB42 · May 11 '24

Same issue. I have investigated a bit and it seems like this may be an issue with Ollama itself. I checked the server logs.

I can see the pull request in the logs:

May 17 15:40:36 pop-os ollama[746628]: [GIN] 2024/05/17 - 15:40:36 | 500 |          5m0s |       127.0.0.1 | POST     "/api/pull"

It stops at exactly 5 minutes, which cannot be a coincidence. Either the client times out, or the server times out.

After digging some more, I found this:

https://github.com/ollama/ollama/blob/7e1e0086e7d18c943ff403a7ca5c2d9ce39f3f4b/server/routes.go#L57C5-L57C27

The session duration in Ollama is 5 minutes. So I don't believe this is an issue with this library per se. Either this library handles a retry, or we ask Ollama to increase this session time, whichever is easier.

pelletier197 · May 17 '24

I've opened this issue on their side. Let's see what they say.

pelletier197 · May 17 '24

It looks like ollama reads the environment variable OLLAMA_KEEP_ALIVE and converts it into the default keep-alive duration: https://github.com/ollama/ollama/blob/7e1e0086e7d18c943ff403a7ca5c2d9ce39f3f4b/server/routes.go#L317

Changing OLLAMA_KEEP_ALIVE may work.
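
For what it's worth, ollama-js also accepts a per-request keep_alive value, so you don't have to touch the server environment. A minimal sketch (the host, model, and '30m' duration are example values; and note the next comment's point that this controls how long the model stays loaded, not the client's fetch timeout):

import { Ollama } from 'ollama';

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' });

// keep_alive accepts a duration string like '30m' or a number of seconds
// and controls how long the server keeps the model loaded after the request.
const response = await ollama.generate({
  model: 'llama3',
  prompt: 'Why is the sky blue?',
  keep_alive: '30m',
});
console.log(response.response);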

CliffHan · May 30 '24

This error is client-side; it has nothing to do with any session timeout on the ollama server. It comes from the fetch implementation itself timing out because it didn't receive any response from the requested server within 5 minutes.

fetch() is a standard, and the standard doesn't say anything about timeouts; it's up to implementers to pick reasonable defaults (source).

In my case, this issue only appeared when not streaming, or when other requests had to be processed first, since without streaming the server doesn't send anything until the entire response is ready.

I only tested this on Node.js; browsers probably behave differently.
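
Since the server sends nothing until the whole response is ready, switching to streaming sidesteps the timeout entirely. A minimal sketch (host and model are example values):

import { Ollama } from "ollama"

const ollama = new Ollama({ host: "http://127.0.0.1:11434" })

// With stream: true the server responds immediately and tokens arrive
// incrementally, so the 5-minute headers timeout never gets a chance to fire.
const stream = await ollama.generate({
  model: "llama3",
  prompt: "Why is the sky blue?",
  stream: true,
})

for await (const part of stream) {
  process.stdout.write(part.response)
}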

An easy solution is to use another fetch implementation.

I tested node-fetch and undici with a request taking over 5 minutes to complete. undici also raises the exact same error, while node-fetch doesn't have this timeout (or it is way higher).

import { Ollama } from "ollama"
import fetch from "node-fetch"

// node-fetch does not enforce undici's 5-minute headers timeout
const ollama = new Ollama({
  fetch: fetch as any
})

The as any cast is only necessary in TypeScript, because node-fetch's method signature doesn't exactly match the built-in fetch.

To test, I started a big request on an underpowered machine and logged the start and end time:

start 2024-06-29T20:06:47.856Z
end 2024-06-29T20:24:20.325Z

About 18 minutes of request time on Node, without streaming and with no headers timeout error.

Be advised that in this case, changing to node-fetch broke streaming requests.
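
If you need streaming to keep working, another option is to keep the built-in fetch but raise (or disable) undici's headers timeout through a custom dispatcher. A sketch, assuming Node's undici-based fetch; noHeadersTimeoutFetch is just an illustrative name, and dispatcher is a Node-specific fetch extension, hence the casts:

import { Ollama } from "ollama"
import { Agent } from "undici"

// undici's default headersTimeout is 300000 ms (5 minutes); 0 disables it
const agent = new Agent({ headersTimeout: 0 })

// Route requests through the custom agent while keeping the native fetch,
// so streaming responses still work as before
const noHeadersTimeoutFetch = (input: RequestInfo | URL, init?: RequestInit) =>
  fetch(input, { ...init, dispatcher: agent } as any)

const ollama = new Ollama({ fetch: noHeadersTimeoutFetch as any })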

programminghoch10 · Jun 29 '24