
[Question]: AssertionError: 5aaf96aa5fc311ef8e3c401a58bf4f84 empty task! when trying to parse data

Open · Said-Apollo opened this issue on Aug 21, 2024 · 2 comments

Describe your problem

I have 5 knowledge bases in total. Uploading documents to the first 4 worked perfectly fine. However, when uploading documents to the last knowledge base, parsing suddenly stopped working. Looking closer, I noticed the following error and warnings:

```
Traceback (most recent call last):
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 375, in <module>
    main()
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 294, in main
    rows = collect()
           ^^^^^^^^^
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 117, in collect
    assert tasks, "{} empty task!".format(msg["id"])
           ^^^^^
AssertionError: 5aaf96aa5fc311ef8e3c401a58bf4f84 empty task!
[WARNING] Load term.freq FAIL!
[WARNING] [2024-08-21 16:08:56,563] [synonym.__init__] [line:40]: Realtime synonym is disabled, since no redis connection.
```

This issue was already mentioned in issues #1108, #1846, and #1091. However, there is no clear explanation of how to fix the error or why it occurs in the first place.

I deleted the last knowledge base again, as well as the Docker container and the respective images, but nothing really helped.
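As a quick sanity check in a situation like this, it can help to see what is still sitting in the Redis task queue after the knowledge base was deleted. The sketch below is only an illustration under stated assumptions: the queue name `task_queue`, the host/port, and the use of a plain Redis list are hypothetical, so substitute whatever identifiers your RAGFlow deployment actually uses (if the queue is a Redis stream, XRANGE/XDEL apply instead of LRANGE/DELETE):

```python
import redis

# Connect to the same Redis instance RAGFlow uses (host/port are assumptions).
r = redis.Redis(host="localhost", port=6379, db=0)

# Hypothetical queue name -- check your RAGFlow configuration for the real one.
QUEUE_NAME = "task_queue"

# Print whatever task messages are still queued.
for raw in r.lrange(QUEUE_NAME, 0, -1):
    print(raw)

# If only orphaned messages from a deleted knowledge base remain, clearing the
# queue stops the executor from tripping over them again:
# r.delete(QUEUE_NAME)
```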

Here is my entrypoint.sh file, which calls the task_executor (it was slightly changed a few weeks ago):

```bash
# Set the Python path
export PYTHONPATH=/home/said/ragflow-test

# Set the Python executable
PY=/home/said/anaconda3/envs/ragflow/bin/python

# Fallback to system Python if the specific Python is not found
if ! command -v $PY &> /dev/null; then
  PY=python3
fi

# Default worker count to 1 if not set or invalid
if [[ -z "$WS" || $WS -lt 1 ]]; then
  WS=1
fi

# Define the task executor function
function task_exe(){
  while true; do
    $PY rag/svr/task_executor.py
    # Optional: Add a small sleep to prevent tight loop in case of persistent failure
    sleep 1
  done
}

# Trap SIGINT and SIGTERM to properly terminate background jobs
trap 'kill $(jobs -p)' SIGINT SIGTERM

# Start the task executors in the background
for ((i=0; i<WS; i++)); do
  task_exe &
done

# Start the main server process
while true; do
  $PY api/ragflow_server.py
  # Optional: Add a small sleep to prevent tight loop in case of persistent failure
  sleep 1
done

# Wait for all background jobs to finish
wait
```

Said-Apollo · Aug 21 '24 14:08

A task record is stored in Redis and MySQL at the same time. If a task pulled from Redis does not exist in MySQL, the assertion fails. This is usually caused by re-parsing or a restart of the system, and can be ignored most of the time.
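To make that a bit more concrete, here is a minimal, hypothetical sketch of the pattern (not RAGFlow's actual code; the queue name, table name, and connection details are assumptions): a message is popped from the Redis queue, the matching rows are looked up in MySQL, and the same kind of assertion fires when the rows are gone.

```python
import json

import pymysql
import redis

# Assumed identifiers and connection details -- the real ones live in RAGFlow's config.
QUEUE_NAME = "task_queue"
r = redis.Redis(host="localhost", port=6379, db=0)
db = pymysql.connect(host="localhost", user="root", password="secret", database="rag_flow")


def collect_once():
    """Pop one task message from Redis and verify it still exists in MySQL."""
    raw = r.lpop(QUEUE_NAME)  # assumed: queue modelled as a plain Redis list
    if raw is None:
        return None  # nothing queued
    msg = json.loads(raw)

    # Look up the task rows that should accompany the queued message.
    with db.cursor() as cur:
        cur.execute("SELECT * FROM task WHERE id = %s", (msg["id"],))
        tasks = cur.fetchall()

    # Same shape as the assertion in task_executor.py's collect(): if the rows
    # were already deleted (re-parse, restart, deleted knowledge base), this
    # raises "... empty task!" even though the message itself was delivered fine.
    assert tasks, "{} empty task!".format(msg["id"])
    return tasks
```

In other words, the message in Redis outlives the rows in MySQL, so the executor finds a task id with nothing behind it; that is why the assertion is usually harmless noise after a re-parse or restart.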

KevinHuSh · Aug 22 '24 01:08

Fixed by #2094

JinHai-CN · Aug 26 '24 06:08