celery-exporter

Bug with hyphens/dash (-) on queue names

Open thiarthur opened this issue 1 year ago • 9 comments

I have an old project using micro-services architecture, where the queues are named "{service-name}-module." When inspecting the metrics endpoints, some metrics work properly:

# TYPE celery_active_consumer_count gauge
celery_active_consumer_count{queue_name="data-manager-search"} 1.0
celery_active_consumer_count{queue_name="data-manager"} 1.0
celery_active_consumer_count{queue_name="data-manager-storage"} 1.0
# HELP celery_active_worker_count The number of active workers in broker queue.
# TYPE celery_active_worker_count gauge
celery_active_worker_count{queue_name="data-manager-search"} 1.0
celery_active_worker_count{queue_name="data-manager"} 1.0
celery_active_worker_count{queue_name="data-manager-storage"} 1.0
# HELP celery_active_process_count The number of active processes in broker queue.
# TYPE celery_active_process_count gauge
celery_active_process_count{queue_name="data-manager-search"} 24.0
celery_active_process_count{queue_name="data-manager"} 24.0
celery_active_process_count{queue_name="data-manager-storage"} 24.0

However, for other metrics, the queue is shown as "celery" for queue_name:

# HELP celery_task_received_total Sent when the worker receives a task.
# TYPE celery_task_received_total counter
celery_task_received_total{hostname="data-manager-worker-58664599dd-6pnc9",name="data_manager.tasks.sleeper_task",queue_name="celery"} 6.0
celery_task_received_total{hostname="data-manager-worker-548d9d96d6-vxnzv",name="data_manager.tasks.sleeper_task",queue_name="celery"} 1.0

To test, I created a simple task:

@app.task(queue="data-manager")
def sleeper_task(sleep_time: int):
    import time

    time.sleep(sleep_time)
    return {"status": "ok"}

This is the Celery command I use in my Kubernetes cluster:

containers:
    - name: data-manager-worker
      image: data-manager:latest
      imagePullPolicy: IfNotPresent
      command:
          [
              "celery",
              "-A",
              "data_manager.tasks",
              "worker",
              "--loglevel=INFO",
              "--task-events",
              "-Q",
              "data-manager-storage,data-manager-search,data-manager",
          ]

After troubleshooting, I found that queue names with underscores (_) worked properly, while queue names with hyphens (-) caused the issue. Attachment: metrics.txt

I am unsure whether this issue originates from the celery-exporter or from the way Celery events handle queue names, but the tasks execute successfully with both queue naming conventions.

If there's any other information that I can provide, let me know.

thiarthur • Oct 03 '24 03:10

Can you enable the debug log on the exporter and paste the logs here?

danihodovic • Oct 15 '24 22:10

I have the same issue. I tried changing the - to _ and it didn't solve the problem.

bashirmindee • Apr 10 '25 15:04

What if you use colons in place of dashes?

danihodovic • Apr 10 '25 18:04

Even with purely alphabetic queue names, I am not able to get the queue name displayed correctly.

bashirmindee • Apr 11 '25 08:04

The queue length was not showing up for my queue because I had disabled remote control on my Celery workers (cf)

However, queue_name still defaults to celery in all task-related metrics. It seems that the received task event doesn't have a queue attribute, which is causing the issue. I am still debugging it.

bashirmindee • Apr 14 '25 09:04

I found the problem: queue_name is displayed as celery when task_send_sent_event is not set to True on the producer's end. https://docs.celeryq.dev/en/stable/userguide/configuration.html#task-send-sent-event

worker_send_task_events should also be enabled on the worker's end.
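As a minimal sketch of the two settings mentioned above, assuming they live in a plain Celery configuration module (the file name celeryconfig.py and the use of a single shared module for producer and worker are assumptions, not something stated in this thread):

```python
# celeryconfig.py (hypothetical file name)
# Loaded with app.config_from_object("celeryconfig") on both producer and worker.

# Producer side: emit a task-sent event when a task is published,
# so monitors can see routing details such as the queue.
task_send_sent_event = True

# Worker side: emit task events (same effect as the --task-events worker flag).
worker_send_task_events = True
```

Whether this resolves the label depends on how the tasks are published, so treat it as one thing to check rather than a guaranteed fix.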

bashirmindee • Apr 14 '25 11:04

It may not be because of task_send_sent_event. It can also happen when a default queue is configured in the Celery settings, e.g. app.conf.task_default_queue = "default-busy", but the task is published without specifying the queue name (in Django, for example).

For example:

app.conf.task_default_queue = "default-busy" is set, so tasks go to the default-busy queue by default.

Metrics will not be correct if:

# data["queue"] is not set
celery_app.send_task(task, **data)  # produces metrics with queue_name="celery", but the task actually goes to `default-busy`

Metrics will be correct if:

data["queue"] = "default-busy"  # "queue" is present in data
celery_app.send_task(task, **data)

To overcome this, I suggest making the default queue name configurable in https://github.com/danihodovic/celery-exporter/blob/a9f84b054d9db393f9f5082668e19a89efc7a53b/src/exporter.py#L282

Expose it as a CLI variable such as CF_DEFAULT_QUEUE_NAME, and use it like this:

"queue_name": getattr(task, "queue", click_params["default_queue_name"]),
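The fallback proposed above can be sketched in isolation. This is only an illustration: queue_label and the SimpleNamespace stand-ins are hypothetical names, not part of celery-exporter, and the real task object in the exporter comes from Celery's event state.

```python
from types import SimpleNamespace


def queue_label(task, default_queue_name="celery"):
    """Return the task's queue, or a configurable default if the event carried none."""
    return getattr(task, "queue", default_queue_name)


# Stand-ins for task objects (hypothetical; real ones come from Celery events).
task_with_queue = SimpleNamespace(name="data_manager.tasks.sleeper_task", queue="data-manager")
task_without_queue = SimpleNamespace(name="data_manager.tasks.sleeper_task")

labels = [
    queue_label(task_with_queue, "default-busy"),     # queue attribute wins
    queue_label(task_without_queue, "default-busy"),  # falls back to the configured default
    queue_label(task_without_queue),                  # falls back to "celery", as today
]
```

The point of making the default a parameter is that the label then matches task_default_queue when the event itself carries no queue.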

Happy to open a PR, if you like.

Refs: https://docs.celeryq.dev/en/stable/userguide/configuration.html#task-default-queue

SilentEntity • May 19 '25 13:05

I think that in @thiarthur's case the task is called without specifying the queue, since the queue is only set on the task definition.

Passing the queue explicitly might solve the problem:

data["queue"] = "data-manager"  # "queue" is present in data
celery_app.send_task("data_manager.tasks.sleeper_task", **data)  # send_task takes the task name as a string
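One way to make sure every published task names its queue is a small helper that fills in the queue when the caller left it out. This is a sketch under assumptions: with_explicit_queue is a hypothetical helper, and the send_task call is left as a comment because it needs a configured Celery app and broker.

```python
def with_explicit_queue(options, default_queue):
    """Return a copy of the publish options that always contains a "queue" key."""
    merged = dict(options)  # do not mutate the caller's dict
    merged.setdefault("queue", default_queue)  # keep an explicit queue if one was given
    return merged


data = with_explicit_queue({"args": [5]}, "data-manager")

# With a configured app, the options could then be passed along, e.g.:
# celery_app.send_task("data_manager.tasks.sleeper_task", **data)
```

An explicitly passed queue is left untouched, so the helper only changes calls that previously relied on the default.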

SilentEntity • May 19 '25 13:05

Feel free to open a pull request, @SilentEntity.

danihodovic • May 19 '25 14:05