
[Bug]: Parsing Status Stuck in "Task is queued..."

Open wuxuehai01 opened this issue 6 months ago • 3 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

f707403

RAGFlow image version

v0.19.0

Other environment information

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         45 bits physical, 48 bits virtual
  Byte Order:            Little Endian

PRETTY_NAME="Ubuntu 22.04 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04 (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Actual behavior

2025-06-16 13:48:27,050 ERROR 1091129 fetch task exception
Traceback (most recent call last):
  File "/home/xwu/projects/ragflow-main/api/db/services/document_service.py", line 519, in update_progress
    info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)
  File "/home/xwu/projects/ragflow-main/api/db/services/document_service.py", line 570, in get_queue_length
    return int(group_info.get("lag", 0))
AttributeError: 'NoneType' object has no attribute 'get'

            priority = 0
            for t in tsks:
                if 0 <= t.progress < 1:
                    finished = False
                if t.progress == -1:
                    bad += 1
                prg += t.progress if t.progress >= 0 else 0
                if t.progress_msg.strip():
                    msg.append(t.progress_msg)
                if t.task_type == "raptor":
                    has_raptor = True
                elif t.task_type == "graphrag":
                    has_graphrag = True
                priority = max(priority, t.priority)
            prg /= len(tsks)
            if finished and bad:
                prg = -1
                status = TaskStatus.FAIL.value
            elif finished:
                if d["parser_config"].get("raptor", {}).get("use_raptor") and not has_raptor:
                    queue_raptor_o_graphrag_tasks(d, "raptor", priority)
                    prg = 0.98 * len(tsks) / (len(tsks) + 1)
                elif d["parser_config"].get("graphrag", {}).get("use_graphrag") and not has_graphrag:
                    queue_raptor_o_graphrag_tasks(d, "graphrag", priority)
                    prg = 0.98 * len(tsks) / (len(tsks) + 1)
                else:
                    status = TaskStatus.DONE.value

            msg = "\n".join(sorted(msg))
            info = {
                "process_duation": datetime.timestamp(
                    datetime.now()) -
                d["process_begin_at"].timestamp(),
                "run": status}
            if prg != 0:
                info["progress"] = prg
            if msg:
                info["progress_msg"] = msg
            else:
                info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)

Here priority keeps its initial value of 0, so get_queue_length(0) is what raises the error.

In api/db/services/document_service.py:

def get_queue_length(priority):
    group_info = REDIS_CONN.queue_info(get_svr_queue_name(priority), SVR_CONSUMER_GROUP_NAME)
    return int(group_info.get("lag", 0))

print(get_svr_queue_name(priority), SVR_CONSUMER_GROUP_NAME) outputs: ('rag_flow_svr_queue', 'rag_flow_svr_task_broker')

Since REDIS_CONN.queue_info(get_svr_queue_name(priority), SVR_CONSUMER_GROUP_NAME) returns None, the subsequent .get("lag", 0) call fails with AttributeError: 'NoneType' object has no attribute 'get'.
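
A minimal defensive fix, as also suggested later in this thread, is to guard against the None return (a sketch against the function quoted above; the fix in the nightly image may differ):

def get_queue_length(priority):
    group_info = REDIS_CONN.queue_info(get_svr_queue_name(priority), SVR_CONSUMER_GROUP_NAME)
    # queue_info() returns None when the stream or consumer group cannot be
    # found (or Redis raised an exception), so treat that as an empty queue
    if group_info is None:
        return 0
    return int(group_info.get("lag", 0))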

Expected behavior

No response

Steps to reproduce

https://ragflow.io/docs/v0.19.0/launch_ragflow_from_source

Follow the guide and upload any file for processing in a newly created Knowledge Base.

(ragflow) xwu@vm01:~/projects/ragflow-main$ sudo docker compose -f docker/docker-compose-base.yml up
(ragflow) xwu@vm01:~/projects/ragflow-main$ python -m api.ragflow_server
(ragflow) xwu@vm01:~/projects/ragflow-main/web$ npm run dev

Additional information

.env:

# The type of doc engine to use.
# Available options:
# - `elasticsearch` (default)
# - `infinity` (https://github.com/infiniflow/infinity)
# - `opensearch` (https://github.com/opensearch-project/OpenSearch)
DOC_ENGINE=${DOC_ENGINE:-elasticsearch}

# ------------------------------
# docker env var for specifying vector db type at startup
# (based on the vector db type, the corresponding docker
# compose profile will be used)
# ------------------------------
COMPOSE_PROFILES=${DOC_ENGINE}

# The version of Elasticsearch.
STACK_VERSION=8.11.3

# The hostname where the Elasticsearch service is exposed
ES_HOST=es01

# The port used to expose the Elasticsearch service to the host machine,
# allowing EXTERNAL access to the service running inside the Docker container.
ES_PORT=1200

# The password for Elasticsearch.
ELASTIC_PASSWORD=infini_rag_flow

# The port used to expose the OpenSearch service; must not be the same as the Elasticsearch port
OS_PORT=1201

# The hostname where the OpenSearch service is exposed
OS_HOST=opensearch01

# The password for OpenSearch.
# At least one uppercase letter, one lowercase letter, one digit, and one special character
OPENSEARCH_PASSWORD=infini_rag_flow_OS_01

# The port used to expose the Kibana service to the host machine,
# allowing EXTERNAL access to the service running inside the Docker container.
KIBANA_PORT=6601
KIBANA_USER=rag_flow
KIBANA_PASSWORD=infini_rag_flow

# The maximum amount of the memory, in bytes, that a specific Docker container can use while running.
# Update it according to the available memory in the host machine.
MEM_LIMIT=8073741824

# The hostname where the Infinity service is exposed
INFINITY_HOST=infinity

# Port to expose Infinity API to the host
INFINITY_THRIFT_PORT=23817
INFINITY_HTTP_PORT=23820
INFINITY_PSQL_PORT=5432

# The password for MySQL.
MYSQL_PASSWORD=infini_rag_flow
# The hostname where the MySQL service is exposed
MYSQL_HOST=mysql
# The database of the MySQL service to use
MYSQL_DBNAME=rag_flow
# The port used to expose the MySQL service to the host machine,
# allowing EXTERNAL access to the MySQL database running inside the Docker container.
MYSQL_PORT=5455

# The hostname where the MinIO service is exposed
MINIO_HOST=minio
# The port used to expose the MinIO console interface to the host machine,
# allowing EXTERNAL access to the web-based console running inside the Docker container.
MINIO_CONSOLE_PORT=9001
# The port used to expose the MinIO API service to the host machine,
# allowing EXTERNAL access to the MinIO object storage service running inside the Docker container.
MINIO_PORT=9000
# The username for MinIO.
# When updated, you must revise the `minio.user` entry in service_conf.yaml accordingly.
MINIO_USER=rag_flow
# The password for MinIO.
# When updated, you must revise the `minio.password` entry in service_conf.yaml accordingly.
MINIO_PASSWORD=infini_rag_flow

# The hostname where the Redis service is exposed
REDIS_HOST=redis
# The port used to expose the Redis service to the host machine,
# allowing EXTERNAL access to the Redis service running inside the Docker container.
REDIS_PORT=6379
# The password for Redis.
REDIS_PASSWORD=infini_rag_flow

# The port used to expose RAGFlow's HTTP API service to the host machine,
# allowing EXTERNAL access to the service running inside the Docker container.
SVR_HTTP_PORT=9380


# The local time zone.
TIMEZONE='Asia/Shanghai'


REGISTER_ENABLED=1

docker/service_conf.yaml.template:

ragflow:
  host: ${RAGFLOW_HOST:-0.0.0.0}
  http_port: 9380
mysql:
  name: '${MYSQL_DBNAME:-rag_flow}'
  user: '${MYSQL_USER:-root}'
  password: '${MYSQL_PASSWORD:-infini_rag_flow}'
  host: '${MYSQL_HOST:-mysql}'
  port: ${MYSQL_PORT:-5455}
  max_connections: 900
  stale_timeout: 300
minio:
  user: '${MINIO_USER:-rag_flow}'
  password: '${MINIO_PASSWORD:-infini_rag_flow}'
  host: '${MINIO_HOST:-minio}:9000'
es:
  hosts: 'http://${ES_HOST:-es01}:${ES_PORT:-1200}'
  username: '${ES_USER:-elastic}'
  password: '${ELASTIC_PASSWORD:-infini_rag_flow}'
os:
  hosts: 'http://${OS_HOST:-opensearch01}:9201'
  username: '${OS_USER:-admin}'
  password: '${OPENSEARCHH_PASSWORD:-infini_rag_flow_OS_01}'
infinity:
  uri: '${INFINITY_HOST:-infinity}:23817'
  db_name: 'default_db'
redis:
  db: 1
  password: '${REDIS_PASSWORD:-infini_rag_flow}'
  host: '${REDIS_HOST:-redis}:6379'



wuxuehai01 avatar Jun 16 '25 06:06 wuxuehai01

This issue is caused by the get_queue_length function in document_service.py, which tries to access group_info.get("lag", 0) without checking if group_info is None. This happens when the queue info lookup fails, often due to missing or uninitialized queue state, leading to the AttributeError you saw. This bug can cause the progress to get stuck at "Task is queued..." even if tasks are being processed.

This specific bug has been fixed in the nightly version of the RAGFlow docker image; updating to the nightly image should resolve the stuck status and prevent the AttributeError [source]. If you are running from source, you may want to add a defensive check in get_queue_length to handle None values for group_info (see the sketch above).

Also, make sure your task executor is running and not stuck or out of memory, as the "Task is queued" status can also appear if the executor is unavailable [source].

If you have customized your docker-compose to use MariaDB instead of MySQL, be aware this can introduce runtime bugs and inaccurate progress reporting [source].




dosubot[bot] avatar Jun 16 '25 06:06 dosubot[bot]

Your Redis queue probably hit some exception. queue_info logs it and returns None:

def queue_info(self, queue, group_name) -> dict | None:
    try:
        groups = self.REDIS.xinfo_groups(queue)
        for group in groups:
            if group["name"] == group_name:
                return group
    except Exception as e:
        logging.warning("RedisDB.queue_info " + str(queue) + " got exception: " + str(e))
    return None

Woody-Hu avatar Jun 16 '25 08:06 Woody-Hu

Your Redis queue probably hit some exception. queue_info logs it and returns None:

def queue_info(self, queue, group_name) -> dict | None:
    try:
        groups = self.REDIS.xinfo_groups(queue)
        for group in groups:
            if group["name"] == group_name:
                return group
    except Exception as e:
        logging.warning("RedisDB.queue_info " + str(queue) + " got exception: " + str(e))
    return None

redis-cli -h redis -p 6379 -a infini_rag_flow PING

I have checked Redis and PING returns PONG.

A quick Redis read/write test:

xwu@vm01:~/projects$ redis-cli -h 127.0.0.1 -p 6379 -a infini_rag_flow SET test_key "hello"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
OK
xwu@vm01:~/projects$ redis-cli -h 127.0.0.1 -p 6379 -a infini_rag_flow GET test_key
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
"hello"

wuxuehai01 avatar Jun 16 '25 08:06 wuxuehai01

It turned out I did not run the task executor:


# Start the task executor (worker 1) with jemalloc preloaded, per the launch-from-source guide
JEMALLOC_PATH=$(pkg-config --variable=libdir jemalloc)/libjemalloc.so;
LD_PRELOAD=$JEMALLOC_PATH python rag/svr/task_executor.py 1;

wuxuehai01 avatar Jun 19 '25 02:06 wuxuehai01

Hello, has this problem been resolved? Could you share the fix for launching from source?

LXCTXDY avatar Jun 23 '25 01:06 LXCTXDY

Hello, has this problem been resolved? Could you share the fix for launching from source?

My problem was that I hadn't started the task executor as required, so the UI kept showing tasks as still queued. It had nothing to do with Redis. I simply started all four services (the docker compose base stack, api.ragflow_server, the task executor, and the web dev server) following https://ragflow.io/docs/v0.19.0/launch_ragflow_from_source

wuxuehai01 avatar Jun 23 '25 02:06 wuxuehai01

Hello, has this problem been resolved? Could you share the fix for launching from source?

My problem was that I hadn't started the task executor as required, so the UI kept showing tasks as still queued. It had nothing to do with Redis. I simply started all four services (the docker compose base stack, api.ragflow_server, the task executor, and the web dev server) following https://ragflow.io/docs/v0.19.0/launch_ragflow_from_source

Got it, thank you.

LXCTXDY avatar Jun 23 '25 02:06 LXCTXDY