[Question]: AssertionError: 5aaf96aa5fc311ef8e3c401a58bf4f84 empty task! when trying to parse data
Describe your problem
I have 5 knowledge bases in total. Uploading documents to the first 4 worked perfectly fine. However, when I uploaded documents to the last knowledge base, parsing suddenly stopped working. Looking closer, I noticed the following error.
Traceback (most recent call last):
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 375, in <module>
    main()
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 294, in main
    rows = collect()
           ^^^^^^^^^
  File "/home/said/ragflow-test/rag/svr/task_executor.py", line 117, in collect
    assert tasks, "{} empty task!".format(msg["id"])
           ^^^^^
AssertionError: 5aaf96aa5fc311ef8e3c401a58bf4f84 empty task!
[WARNING] Load term.freq FAIL!
[WARNING] [2024-08-21 16:08:56,563] [synonym.__init__] [line:40]: Realtime synonym is disabled, since no redis connection.
This issue was already mentioned in issues #1108, #1846 and #1091. However, none of them gives a clear way to fix the error or explains why it occurs in the first place.
I deleted the last knowledge base again, as well as the Docker container and the respective images. However, the error persists.
Here is my entrypoint.sh file, which calls the task_executor (it was slightly changed a few weeks ago):
`# Set the Python path
export PYTHONPATH=/home/said/ragflow-test

# Set the Python executable
PY=/home/said/anaconda3/envs/ragflow/bin/python

# Fallback to system Python if the specific Python is not found
if ! command -v $PY &> /dev/null; then
    PY=python3
fi

# Default worker count to 1 if not set or invalid
if [[ -z "$WS" || $WS -lt 1 ]]; then
    WS=1
fi

# Define the task executor function
function task_exe(){
    while true; do
        $PY rag/svr/task_executor.py
        # Optional: Add a small sleep to prevent tight loop in case of persistent failure
        sleep 1
    done
}

# Trap SIGINT and SIGTERM to properly terminate background jobs
trap 'kill $(jobs -p)' SIGINT SIGTERM

# Start the task executors in the background
for ((i=0; i<WS; i++)); do
    task_exe &
done

# Start the main server process
while true; do
    $PY api/ragflow_server.py
    # Optional: Add a small sleep to prevent tight loop in case of persistent failure
    sleep 1
done

# Wait for all background jobs to finish
wait`
A task record is stored in both Redis and MySQL at the same time. If a task pulled from Redis no longer exists in MySQL, the assertion fails. This is usually caused by re-parsing a document or restarting the system, and it can be ignored most of the time.
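To illustrate the mismatch described above, here is a minimal sketch, not the actual RAGFlow code and not necessarily what the fix does. It assumes a queue message carries a task `id` and replaces MySQL with a plain dictionary. The names `collect_tolerant` and `task_table` are hypothetical. Instead of asserting, it logs the stale message and skips it, so the executor process would not exit on a leftover queue entry.

```python
import logging

logger = logging.getLogger("task_executor_sketch")


def collect_tolerant(msg, task_table):
    """Return the task rows for a queue message, or [] if the record is gone.

    msg        -- dict with an "id" key (stand-in for a Redis queue message)
    task_table -- dict mapping task id -> list of rows (stand-in for MySQL)
    """
    tasks = task_table.get(msg["id"], [])
    if not tasks:
        # Stale message: the queue still holds the id, but the database row
        # was removed (e.g. after a re-parse or restart). Skip it instead
        # of raising AssertionError.
        logger.warning("%s empty task, skipping", msg["id"])
        return []
    return tasks


# Hypothetical data: one task that still exists, one that does not.
task_table = {"task-a": [{"id": "task-a", "doc_id": "doc-1"}]}

print(collect_tolerant({"id": "task-a"}, task_table))
print(collect_tolerant({"id": "5aaf96aa5fc311ef8e3c401a58bf4f84"}, task_table))
```

With this shape, the caller can treat an empty result as "nothing to do" and move on to the next queue message.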
Fixed by #2094