LangChain in AI Beta Hangs sometimes / cannot be stopped via UI

Open · amenk opened this issue 1 year ago · 6 comments

Describe the bug

When using LangChain on the AI Beta, the workflow sometimes hangs, does not finish, and cannot be stopped.

To Reproduce (not always reproducible, but it happens more often than not):

  1. Start the workflow
  2. The chain does not run through
  3. Click "Stop workflow"
  4. Everything hangs; see the following GIF:

(animated GIF: "n8n-llm all hangs")

Expected behavior

Everything works, or at least an error appears if something is wrong with the ChatGPT API, and the workflow can be stopped.

Environment (please complete the following information):

docker run -it --rm --name n8nai -p 5678:5678 -e N8N_LOG_LEVEL=debug -e N8N_LOG_OUTPUT=console -v ~/.n8nai:/home/node/.n8n docker.n8n.io/n8nio/n8n:ai-beta
  • OS: 22.04
  • n8n Version: Docker image docker.n8n.io/n8nio/n8n:ai-beta (f38ede559f01, created 5 days ago, 754 MB)
  • Node.js Version: whatever the Docker image ships
  • Database system: Docker image default
  • Operation mode: ?

Additional context

On the console I see:

2023-11-09T09:39:00.614Z [Rudder] debug: in flush
2023-11-09T09:39:00.615Z [Rudder] debug: cancelling existing timer...
2023-11-09T09:39:00.615Z [Rudder] debug: cancelling existing flushTimer...
2023-11-09T09:39:00.615Z [Rudder] debug: batch size is 3
2023-11-09T09:39:20.865Z [Rudder] debug: in flush
2023-11-09T09:39:20.865Z [Rudder] debug: cancelling existing timer...
2023-11-09T09:39:20.865Z [Rudder] debug: queue is empty, nothing to flush
2023-11-09T09:39:37.470Z | debug    | Wait tracker querying database for waiting executions "{ file: 'WaitTracker.js', function: 'getWaitingExecutions' }"
2023-11-09T09:40:37.471Z | debug    | Wait tracker querying database for waiting executions "{ file: 'WaitTracker.js', function: 'getWaitingExecutions' }"
2023-11-09T09:40:51.250Z | debug    | Proxying request to axios "{ file: 'LoggerProxy.js', function: 'exports.debug' }"
2023-11-09T09:41:37.473Z | debug    | Wait tracker querying database for waiting executions "{ file: 'WaitTracker.js', function: 'getWaitingExecutions' }"
2023-11-09T09:42:37.475Z | debug    | Wait tracker querying database for waiting executions "{ file: 'WaitTracker.js', function: 'getWaitingExecutions' }"

After pressing Ctrl+C:

2023-11-09T09:44:09.902Z [Rudder] debug: batch size is 1
Waiting for 1 active executions to finish...
 - Execution ID 393, workflow ID: enkm09FeKlhfSAlB
Waiting for 1 active executions to finish...
 - Execution ID 393, workflow ID: enkm09FeKlhfSAlB

amenk · Nov 09 '23 09:11

PS: I got an error from the pipeline for this issue: https://github.com/n8n-io/n8n/actions/runs/6810063907

amenk · Nov 09 '23 09:11

@amenk Is it possible you're hitting TPM/RPM/RPD rate limits for your OpenAI org? The model will retry 6 times with an exponential backoff between each attempt, which can look like a hang while it is actually waiting to retry. I'll add a configuration option to the model for this, so you'd be able to disable or lower the retries.
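
For illustration only, here is a minimal sketch of what that retry behaviour looks like when the same model is called directly through LangChain JS (outside n8n), and how the retry count could be capped once such an option is exposed. The import path and option names reflect the langchain package as of late 2023 and may differ in newer releases; the model name is just an example:

// Sketch: cap retries and add a request timeout so a rate-limited call
// fails quickly instead of looking like a hang.
import { ChatOpenAI } from "langchain/chat_models/openai";

const model = new ChatOpenAI({
  modelName: "gpt-3.5-turbo", // example model, not necessarily the one used in the workflow
  maxRetries: 1,              // LangChain's default is 6, with exponential backoff between attempts
  timeout: 30_000,            // give up on a single request after 30 seconds
});

const reply = await model.invoke("Say hello");
console.log(reply.content);

With the default of 6 retries and exponential backoff, a single rate-limited request can easily keep the node busy for a minute or more, which would match the "hanging" behaviour in the GIF above.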

OlegIvaniv · Nov 09 '23 10:11

@OlegIvaniv Yes, I also assume some limit is being hit at OpenAI.

  1. Where can I see this?
  2. Still, shouldn't the workflow be stoppable without killing the container?

amenk · Nov 09 '23 10:11

@amenk You can check your usage in the OpenAI usage dashboard.

OlegIvaniv · Nov 09 '23 10:11

@OlegIvaniv Yeah, sure. But I don't believe the current rate limits are shown there. I was thinking more of the response headers, which do not seem to be displayed even with the "debug" log level.

Also, it was my first run today, so I should not have hit any limits. But it's all guessing ;-) On the second run it worked (after restarting the Docker container), but I saw similar issues sporadically during the last few days.

The main reason I opened the issue is the hard hang; I believe that should never happen, even if an external API is rate limiting.

amenk · Nov 09 '23 10:11

I found a new hint in the log:

2023-11-10T11:54:28.721Z | error    | WorkflowOperationError: Only running or waiting executions can be stopped and 408 is currently crashed. "{ file: 'LoggerProxy.js', function: 'exports.error' }"

How should such crashed executions be handled?

Is this considered a bug? Are there any workarounds? Can the exponential back-off somehow be limited? There is a retry-on-fail setting, but the hang also happens when it is off.

Is there any other log that could help here? I think "debug" is already the highest log level?
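
On the workaround question: until retries can be disabled in the node itself, one generic client-side pattern is to bound a call with a timeout so the caller at least gets an error instead of waiting through every backoff attempt. A minimal TypeScript sketch (plain Promise.race, no n8n or LangChain APIs involved; the chain.invoke call in the usage comment is hypothetical):

// Sketch: reject after `ms` milliseconds instead of waiting indefinitely.
// Note: this only unblocks the caller; the underlying request keeps retrying
// in the background, which is why a real fix needs a retry/abort option in the node.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical usage around a LangChain chain call:
// const result = await withTimeout(chain.invoke({ input: "..." }), 60_000);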

amenk · Nov 10 '23 11:11