dify After upgrading from v0.5.3 to v1.0.0, the knowledge base cannot parse and vectorize documents, and an error is reported directly.

Self Checks

[x] This is only for bug report, if you would like to ask a question, please head to Discussions.
[x] I have searched for existing issues, including closed ones.
[x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[x] Please do not modify this template :) and fill in all the required fields.

Dify version

v1.0.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

After upgrading from v0.5.3 to v1.0.0, the knowledge base can no longer parse and vectorize documents; an error is reported immediately.

Upgrade process:

1. Back up the volumes directory.
2. Download the new project files from GitHub.
3. Update the three images: langgenius/dify-web:1.0.0, docker-api-1, and langgenius/dify-plugin-daemon:0.0.3-local.
4. Run docker-compose down, then docker-compose up -d.

Then run the following commands in sequence:

    poetry run flask extract-plugins --workers=20
    poetry run flask db upgrade
    poetry run flask migrate-data-for-plugin

Two servers were upgraded. Server A could not download plugins, so the installation packages were downloaded manually from the official website and uploaded for local installation. Server B was able to install plugins directly online.

The current problem is that the knowledge base on server A cannot parse newly added documents, while server B works normally. In addition, both servers share a second problem: once documents in the knowledge base have been archived, the archive can no longer be undone, and attempting to undo it reports an error.
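The three `poetry run flask` commands in this section are described as being executed in sequence. A minimal Python sketch of that sequencing follows, in which a later step is skipped if an earlier one fails. It assumes the commands are launched from a directory where `poetry` and the Dify API project are available (the report does not say where they were run). `run_migration_steps` and its `runner` parameter are hypothetical names introduced only for illustration; they are not part of Dify.

```python
import shlex
import subprocess

# Commands exactly as listed in the report, in the order they were executed.
MIGRATION_COMMANDS = [
    "poetry run flask extract-plugins --workers=20",
    "poetry run flask db upgrade",
    "poetry run flask migrate-data-for-plugin",
]


def run_migration_steps(commands=MIGRATION_COMMANDS, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    `runner` is a hypothetical injection point so the sequencing can be
    exercised without poetry installed; it defaults to subprocess.run.
    Returns the list of commands that completed successfully.
    """
    completed = []
    for command in commands:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit status, which aborts the remaining steps.
        runner(shlex.split(command), check=True)
        completed.append(command)
    return completed
```

Calling `run_migration_steps()` with the default runner executes the real commands, so a failed `db upgrade` would prevent `migrate-data-for-plugin` from running against a partially migrated database.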

The following is the log of the docker-worker-1 container on server A:

2025-03-04 13:32:53.391 INFO [MainThread] [connection.py:22] - Connected to redis://:@redis:6379/1
2025-03-04 13:32:53.394 INFO [MainThread] [mingle.py:40] - mingle: searching for neighbors
2025-03-04 13:32:54.401 INFO [MainThread] [mingle.py:49] - mingle: all alone
2025-03-04 13:32:54.412 INFO [MainThread] [worker.py:175] - celery@82791388ba25 ready.
2025-03-04 13:32:54.414 INFO [Dummy-1] [pidbox.py:111] - pidbox: Connected to redis://:@redis:6379/1.
2025-03-04 13:35:40.862 INFO [MainThread] [strategy.py:161] - Task tasks.document_indexing_task.document_indexing_task[96b7176c-40d8-41ef-8cab-e65f6ae466f3] received
2025-03-04 13:35:40.898 INFO [Dummy-2] [document_indexing_task.py:59] - Start process document: 38cc0d19-97e6-431c-a615-76beeaa15b07
2025-03-04 13:35:40.903 INFO [Dummy-2] [document_indexing_task.py:59] - Start process document: 5025fdf0-98f0-4665-b1ce-2f8887521495
2025-03-04 13:39:41.024 ERROR [Dummy-2] [indexing_runner.py:96] - consume document failed
Traceback (most recent call last):
  File "/app/api/core/indexing_runner.py", line 73, in run
    documents = self._transform(
  File "/app/api/core/indexing_runner.py", line 706, in _transform
    documents = index_processor.transform(
  File "/app/api/core/rag/index_processor/processor/parent_child_index_processor.py", line 56, in transform
    document_nodes = splitter.split_documents([document])
  File "/app/api/core/rag/splitter/text_splitter.py", line 96, in split_documents
    return self.create_documents(texts, metadatas=metadatas)
  File "/app/api/core/rag/splitter/text_splitter.py", line 81, in create_documents
    for chunk in self.split_text(text):
  File "/app/api/core/rag/splitter/fixed_text_splitter.py", line 68, in split_text
    chunks_lengths = self._length_function(chunks)
  File "/app/api/core/rag/splitter/fixed_text_splitter.py", line 38, in _token_encoder
    return embedding_model_instance.get_text_embedding_num_tokens(texts=texts)
  File "/app/api/core/model_manager.py", line 244, in get_text_embedding_num_tokens
    self._round_robin_invoke(
  File "/app/api/core/model_manager.py", line 370, in _round_robin_invoke
    return function(*args, **kwargs)
  File "/app/api/core/model_runtime/model_providers/__base/text_embedding_model.py", line 65, in get_num_tokens
    return plugin_model_manager.get_text_embedding_num_tokens(
  File "/app/api/core/plugin/manager/model.py", line 313, in get_text_embedding_num_tokens
    for resp in response:
  File "/app/api/core/plugin/manager/base.py", line 189, in _request_with_plugin_daemon_response_stream
    self._handle_plugin_daemon_error(error.error_type, error.message)
  File "/app/api/core/plugin/manager/base.py", line 223, in _handle_plugin_daemon_error
    raise PluginDaemonInternalServerError(description=message)
core.plugin.manager.exc.PluginDaemonInternalServerError: PluginDaemonInternalServerError: killed by timeout
2025-03-04 13:39:41.062 WARNING [Dummy-2] [warnings.py:112] - /app/api/.venv/lib/python3.12/site-packages/pypdfium2/_helpers/textpage.py:80: UserWarning: get_text_range() call with default params will be implicitly redirected to get_text_bounded()
  warnings.warn("get_text_range() call with default params will be implicitly redirected to get_text_bounded()")
2025-03-04 13:43:41.368 ERROR [Dummy-2] [indexing_runner.py:96] - consume document failed
Traceback (most recent call last):
  File "/app/api/core/indexing_runner.py", line 73, in run
    documents = self._transform(
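
The timestamps in the worker log make it possible to check how long each indexing attempt ran before "killed by timeout" was raised. The short Python sketch below uses only timestamps copied from the log above; `seconds_between` is a hypothetical helper name. It assumes the second document's attempt began right after the first failure, which the log suggests but does not state.

```python
from datetime import datetime

# Timestamps copied from the docker-worker-1 log in this report.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TASK_START = "2025-03-04 13:35:40.898"      # first "Start process document"
FIRST_FAILURE = "2025-03-04 13:39:41.024"   # first "consume document failed"
SECOND_FAILURE = "2025-03-04 13:43:41.368"  # second "consume document failed"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    begin = datetime.strptime(start, LOG_TIME_FORMAT)
    finish = datetime.strptime(end, LOG_TIME_FORMAT)
    return (finish - begin).total_seconds()


if __name__ == "__main__":
    print(seconds_between(TASK_START, FIRST_FAILURE))
    print(seconds_between(FIRST_FAILURE, SECOND_FAILURE))
```

Both intervals come out at roughly 240 seconds (about 240.1 and 240.3). That is consistent with a fixed limit of about four minutes on the plugin daemon call made by `get_text_embedding_num_tokens`, although the log alone does not show which setting imposes that limit.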