db connection error when embedding task takes a long time
Self Checks
- [x] I have read the Contributing Guide and Language Policy.
- [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report, otherwise it will be closed.
- [x] [Chinese & Non-English Users] Please submit in English, otherwise the issue will be closed :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
1.9.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
When I uploaded a CSV file with approximately 24,000 rows and used newline characters as delimiters for segmentation, the embedding process took a long time and eventually failed with a database connection error. However, when I split the same file into 10 smaller files (each containing about 2,400 rows), the process completed successfully without any issues.
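The split workaround described above can be sketched roughly as follows. This is only an illustration, not part of Dify: the function names `split_csv` and `_write_part`, the output file naming, and the assumption that the CSV has a single header row are all hypothetical. The default of 2,400 rows per file simply mirrors the split that worked here.

```python
import csv
from pathlib import Path


def split_csv(src: Path, out_dir: Path, rows_per_file: int = 2400) -> list[Path]:
    """Split src into parts of at most rows_per_file data rows each.

    Assumes the first row is a header and repeats it in every part.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return written  # empty input, nothing to split
        chunk: list[list[str]] = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == rows_per_file:
                written.append(_write_part(out_dir, src.stem, len(written) + 1, header, chunk))
                chunk = []
        if chunk:  # remaining rows that did not fill a whole part
            written.append(_write_part(out_dir, src.stem, len(written) + 1, header, chunk))
    return written


def _write_part(out_dir: Path, stem: str, index: int,
                header: list[str], rows: list[list[str]]) -> Path:
    """Write one part file (hypothetical naming: <stem>_partNN.csv)."""
    path = out_dir / f"{stem}_part{index:02d}.csv"
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return path
```

Each part can then be uploaded as a separate document, which keeps every individual embedding run short.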
The error report is as follows:
2025-11-05 08:22:03,749.749 INFO [ThreadPoolExecutor-10_0] [vector_factory.py:194] - Embedding batch 3/3454 took 138.00466442108154 s
2025-11-05 08:22:04,497.497 INFO [ThreadPoolExecutor-10_0] [vector_factory.py:198] - Embedding 2455 texts took 2888.9533185958862 s
2025-11-05 08:22:04,888.888 ERROR [Dummy-33] [indexing_runner.py:97] - consume document failed
Traceback (most recent call last):
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2118, in _exec_insertmany_context
dialect.do_execute(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 951, in do_execute
cursor.execute(statement, parameters)
File "/app/api/.venv/lib/python3.12/site-packages/psycogreen/gevent.py", line 32, in gevent_wait_callback
state = conn.poll()
^^^^^^^^^^^
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/core/indexing_runner.py", line 81, in run
self._load(
File "/app/api/core/indexing_runner.py", line 571, in _load
tokens += future.result()
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/core/indexing_runner.py", line 630, in _process_chunk
index_processor.load(dataset, chunk_documents, with_keywords=False)
File "/app/api/core/rag/index_processor/processor/paragraph_index_processor.py", line 82, in load
vector.create(documents)
File "/app/api/core/rag/datasource/vdb/vector_factory.py", line 193, in create
batch_embeddings = self._embeddings.embed_documents([document.page_content for document in batch])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/core/rag/embedding/cached_embedding.py", line 103, in embed_documents
raise ex
File "/app/api/core/rag/embedding/cached_embedding.py", line 97, in embed_documents
db.session.commit()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/scoping.py", line 599, in commit
return self._proxied.commit()
^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2032, in commit
trans.commit(_to_root=True)
File "<string>", line 2, in commit
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
ret_value = fn(self, *arg, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1313, in commit
self._prepare_impl()
File "<string>", line 2, in _prepare_impl
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
ret_value = fn(self, *arg, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1288, in _prepare_impl
self.session.flush()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4345, in flush
self._flush(objects)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4480, in _flush
with util.safe_reraise():
^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 224, in __exit__
raise exc_value.with_traceback(exc_tb)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4441, in _flush
flush_context.execute()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
rec.execute(self)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
util.preloaded.orm_persistence.save_obj(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
_emit_insert_statements(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 1143, in _emit_insert_statements
result = connection.execute(
^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1419, in execute
return meth(
^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 526, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1641, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
return self._exec_insertmany_context(dialect, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2126, in _exec_insertmany_context
self._handle_dbapi_exception(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2118, in _exec_insertmany_context
dialect.do_execute(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 951, in do_execute
cursor.execute(statement, parameters)
File "/app/api/.venv/lib/python3.12/site-packages/psycogreen/gevent.py", line 32, in gevent_wait_callback
state = conn.poll()
^^^^^^^^^^^
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
[SQL: INSERT INTO embeddings (model_name, hash, embedding, provider_name) VALUES (%(model_name)s, %(hash)s, %(embedding)s, %(provider_name)s) RETURNING embeddings.id, embeddings.created_at]
[parameters: {'model_name': 'bge-m3', 'hash': '33814e1d011f86c8b67a7300178ec27291e3121a6db01c860c64f1ba3f32cbe3', 'embedding': <psycopg2.extensions.Binary object at 0x7fc12d4be0a0>, 'provider_name': 'langgenius/xinference/xinference'}]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
2025-11-05 08:22:04,895.895 ERROR [Dummy-33] [retry_document_indexing_task.py:120] - retry_document_indexing_task failed, dataset_id: 377f0818-fba6-486e-b7ec-382b339aee9b, document_ids: ['d8537b06-a879-460e-82df-fc19689765d2']
Traceback (most recent call last):
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2118, in _exec_insertmany_context
dialect.do_execute(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 951, in do_execute
cursor.execute(statement, parameters)
File "/app/api/.venv/lib/python3.12/site-packages/psycogreen/gevent.py", line 32, in gevent_wait_callback
state = conn.poll()
^^^^^^^^^^^
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/api/core/indexing_runner.py", line 81, in run
self._load(
File "/app/api/core/indexing_runner.py", line 571, in _load
tokens += future.result()
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/core/indexing_runner.py", line 630, in _process_chunk
index_processor.load(dataset, chunk_documents, with_keywords=False)
File "/app/api/core/rag/index_processor/processor/paragraph_index_processor.py", line 82, in load
vector.create(documents)
File "/app/api/core/rag/datasource/vdb/vector_factory.py", line 193, in create
batch_embeddings = self._embeddings.embed_documents([document.page_content for document in batch])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/core/rag/embedding/cached_embedding.py", line 103, in embed_documents
raise ex
File "/app/api/core/rag/embedding/cached_embedding.py", line 97, in embed_documents
db.session.commit()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/scoping.py", line 599, in commit
return self._proxied.commit()
^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2032, in commit
trans.commit(_to_root=True)
File "<string>", line 2, in commit
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
ret_value = fn(self, *arg, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1313, in commit
self._prepare_impl()
File "<string>", line 2, in _prepare_impl
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
ret_value = fn(self, *arg, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1288, in _prepare_impl
self.session.flush()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4345, in flush
self._flush(objects)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4480, in _flush
with util.safe_reraise():
^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 224, in __exit__
raise exc_value.with_traceback(exc_tb)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 4441, in _flush
flush_context.execute()
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
rec.execute(self)
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
util.preloaded.orm_persistence.save_obj(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
_emit_insert_statements(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/orm/persistence.py", line 1143, in _emit_insert_statements
result = connection.execute(
^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1419, in execute
return meth(
^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 526, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1641, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
return self._exec_insertmany_context(dialect, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2126, in _exec_insertmany_context
self._handle_dbapi_exception(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2118, in _exec_insertmany_context
dialect.do_execute(
File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 951, in do_execute
cursor.execute(statement, parameters)
File "/app/api/.venv/lib/python3.12/site-packages/psycogreen/gevent.py", line 32, in gevent_wait_callback
state = conn.poll()
^^^^^^^^^^^
✔️ Expected Behavior
The embedding completes normally.
❌ Actual Behavior
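The document stays in the "Embedding processing..." state and eventually fails with the database connection error shown above.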
PR https://github.com/langgenius/dify/pull/26196 was released with 1.9.1, but the issue still occurs. I also found that these changes no longer exist in the main branch.
You're correct—the changes from PR #26196 (explicitly closing the DB session before embedding to prevent timeouts) are not present in the main branch or in the 1.9.1 release, despite the PR being merged and referenced in release notes. This likely means the fix was reverted, overwritten, or not properly merged into the release branch, which matches your observation and explains why the issue persists in 1.9.1. The embedding pipeline still risks holding a long-lived DB connection during lengthy embedding tasks, leading to the PostgreSQL disconnects you’re seeing (source).
For a confirmed fix, update to Dify 1.9.2 or later, where the session management patch is included and users have reported successful resolution of this exact problem (source). If you can't upgrade, you can manually patch your deployment by ensuring the DB session is closed before starting long embedding operations, as described in PR #26196.
Let me know if you need a code snippet or more details on how to apply the workaround.
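A minimal sketch of the idea behind that workaround, assuming a SQLAlchemy-style session object that exposes `close()`, `commit()`, and `rollback()`. This is not the code from PR #26196: the function `embed_then_persist` and its `embed` and `persist` callables are hypothetical stand-ins for the embedding call and the cache insert.

```python
from typing import Callable, Protocol, Sequence


class SessionLike(Protocol):
    """The subset of session methods this sketch relies on."""

    def close(self) -> None: ...
    def commit(self) -> None: ...
    def rollback(self) -> None: ...


def embed_then_persist(
    session: SessionLike,
    texts: Sequence[str],
    embed: Callable[[Sequence[str]], list[list[float]]],
    persist: Callable[[SessionLike, Sequence[str], list[list[float]]], None],
) -> list[list[float]]:
    """Release the DB connection before a slow embedding call, then write results."""
    # Close first so no pooled connection is held idle while the
    # embedding request runs (possibly for many minutes).
    session.close()

    vectors = embed(texts)  # slow step, no DB connection checked out

    try:
        # The session acquires a connection again on first use here.
        persist(session, texts, vectors)
        session.commit()
    except Exception:
        session.rollback()
        raise
    return vectors
```

One trade-off to keep in mind: with SQLAlchemy, closing a session detaches the objects it had loaded, so any code that touches those objects afterwards has to re-query them.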
To reply, just mention @dosu.
Moreover, I've noticed that after this issue occurs, the number of database connections from Dify nearly doubles.
This doubling of database connections after an embedding failure is likely due to a connection leak in the current Dify codebase. When a long-running embedding task fails (for example, due to a timeout or database error), the code only rolls back the session but does not explicitly close or remove it. Because Dify uses a shared scoped session, failed or orphaned connections can accumulate, especially after errors in threaded or concurrent embedding tasks. This leaves connections open in the pool, which can quickly double the number of active connections after repeated failures, matching what you’re seeing (source).
A previous fix that closed the session before embedding was reverted to avoid breaking downstream code, but this also removed the safeguard against leaks (source). Until session cleanup is handled more robustly, this leak will persist after embedding errors. As a workaround, you can manually patch your deployment to ensure sessions are closed or removed after failures, or consider upgrading if a future release addresses this with better session management.
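The cleanup described above could look roughly like this. It is a sketch under assumptions, not Dify's implementation: `run_indexing_job` and `work` are hypothetical names, and `session` is assumed to behave like a SQLAlchemy `scoped_session`, whose `remove()` closes the current session and discards it from the registry.

```python
from typing import Any, Callable, Protocol


class ScopedSessionLike(Protocol):
    """The subset of scoped-session methods this sketch relies on."""

    def commit(self) -> None: ...
    def rollback(self) -> None: ...
    def remove(self) -> None: ...


def run_indexing_job(session: ScopedSessionLike,
                     work: Callable[[ScopedSessionLike], Any]) -> Any:
    """Run work(session) and always hand the connection back afterwards."""
    try:
        result = work(session)
        session.commit()
        return result
    except Exception:
        # Rolling back alone leaves the session (and its connection) alive.
        session.rollback()
        raise
    finally:
        # Runs on both success and failure, so a failed embedding task
        # does not leave an orphaned session behind.
        session.remove()
```

If the work runs in a thread pool, this wrapper would need to be called inside each worker, since a scoped session is usually bound to the thread or context that created it.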