
[BUG] GET `/api/workers/stats` Internal Server Error

Open captain828 opened this issue 1 year ago • 2 comments

Describe the bug: The worker detail page loads indefinitely for the worker, broker and process pool sections, then shows an error.

To Reproduce (steps to reproduce the behavior):

  1. Have a functional Celery setup with at least one worker (see the config below for specifics).
  2. Set up celery-insights via Docker with: docker run -p 8555:8555 --name celery-insights -e BROKER_URL=redis://host.docker.internal:6379 -e RESULT_BACKEND=redis://host.docker.internal:6379 ghcr.io/danyi1212/celery-insights:latest
  3. Go to the root Celery Insights page.
  4. Click on view for the worker.
  5. Observe the errors in the worker, broker and process pool sections.

Expected behavior: I would like to see the same statistics as on the demo for workers, brokers and process pools.

Screenshots: (two images, not preserved)

Desktop (please complete the following information):

  • OS: Windows 10 Pro
  • Browser: Chrome
  • Version: 119.0.6045.105 (Official Build) (64-bit)

Celery config: settings.py

# Celery
CELERY_BROKER_URL = CACHES['default']['LOCATION']  # redis://host.docker.internal:6379
CELERY_BROKER_CONNECTION_RETRY_ON_STARTUP = True
CELERY_RESULT_BACKEND = CELERY_BROKER_URL
CELERY_RESULT_BACKEND_MAX_RETRIES = 5
CELERY_RESULT_COMPRESSION = 'brotli'
CELERY_RESULT_EXTENDED = True
CELERY_TASK_ACKS_LATE = True
CELERY_TASK_SEND_SENT_EVENT = True
CELERY_TASK_DEFAULT_QUEUE = 'default'
CELERY_TASK_COMPRESSION = CELERY_RESULT_COMPRESSION
CELERY_WORKER_DEDUPLICATE_SUCCESSFUL_TASKS = True
CELERY_WORKER_MAX_MEMORY_PER_CHILD = 128000  # 128 MB
CELERY_WORKER_CANCEL_LONG_RUNNING_TASKS_ON_CONNECTION_LOSS = True
CELERY_WORKER_SEND_TASK_EVENTS = True
CELERY_WORKER_POOL = 'threads'

# Celery Beat
CELERY_BEAT_SCHEDULER = 'redbeat.RedBeatScheduler'

Celery worker command:

celery -A config worker -Q default --hostname default@%%h --loglevel INFO --prefetch-multiplier 1 --time-limit 30 --max-tasks-per-child 100 --autoscale 4,16

Additional context (error trace):

2023-11-10 11:25:20 INFO:     172.17.0.1:44790 - "GET /api/workers/stats?timeout=10&worker=default%40CAPTAIN-PC HTTP/1.1" 500 Internal Server Error
2023-11-10 11:25:20 ERROR:    Exception in ASGI application
2023-11-10 11:25:20 Traceback (most recent call last):
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
2023-11-10 11:25:20     result = await app(  # type: ignore[func-returns-value]
2023-11-10 11:25:20              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
2023-11-10 11:25:20     return await self.app(scope, receive, send)
2023-11-10 11:25:20            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 276, in __call__
2023-11-10 11:25:20     await super().__call__(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
2023-11-10 11:25:20     await self.middleware_stack(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
2023-11-10 11:25:20     raise exc
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
2023-11-10 11:25:20     await self.app(scope, receive, _send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 84, in __call__
2023-11-10 11:25:20     await self.app(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
2023-11-10 11:25:20     raise exc
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
2023-11-10 11:25:20     await self.app(scope, receive, sender)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
2023-11-10 11:25:20     raise e
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
2023-11-10 11:25:20     await self.app(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
2023-11-10 11:25:20     await route.handle(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
2023-11-10 11:25:20     await self.app(scope, receive, send)
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
2023-11-10 11:25:20     response = await func(request)
2023-11-10 11:25:20                ^^^^^^^^^^^^^^^^^^^
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 255, in app
2023-11-10 11:25:20     content = await serialize_response(
2023-11-10 11:25:20               ^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-10 11:25:20   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 141, in serialize_response
2023-11-10 11:25:20     raise ValidationError(errors, field.type_)
2023-11-10 11:25:20 pydantic.error_wrappers.ValidationError: 6 validation errors for Stats
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> broker -> login_method
2023-11-10 11:25:20   none is not an allowed value (type=type_error.none.not_allowed)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> broker -> userid
2023-11-10 11:25:20   none is not an allowed value (type=type_error.none.not_allowed)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> max-tasks-per-child
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> processes
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> timeouts
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> rusage
2023-11-10 11:25:20   value is not a valid dict (type=type_error.dict)
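
The six validation errors above fall into three kinds: a value that is None where one is required (broker login_method and userid), a key missing from the pool section (max-tasks-per-child, processes, timeouts), and rusage arriving as something other than a dict. Below is a minimal standard-library sketch that checks a raw stats payload against those same expectations. The function name and the sample payload are hypothetical and are not taken from celery-insights or from the worker in this report; only the field names come from the trace.

```python
from typing import Any


def find_stats_problems(stats: dict[str, Any]) -> list[str]:
    """Return one message per field that would trip the checks seen in the trace."""
    problems: list[str] = []

    # Broker fields: the trace reports None values. A missing key is treated
    # the same as None here, purely for simplicity.
    broker = stats.get("broker") or {}
    for key in ("login_method", "userid"):
        if broker.get(key) is None:
            problems.append(f"broker -> {key}: none is not an allowed value")

    # Pool fields: the trace reports these keys as absent altogether.
    pool = stats.get("pool") or {}
    for key in ("max-tasks-per-child", "processes", "timeouts"):
        if key not in pool:
            problems.append(f"pool -> {key}: field required")

    # rusage must be a mapping; any other type is rejected.
    if not isinstance(stats.get("rusage"), dict):
        problems.append("rusage: value is not a valid dict")

    return problems


# Hypothetical payload shaped like the failing response (values are made up).
sample = {
    "broker": {"login_method": None, "userid": None},
    "pool": {"max-concurrency": 16},
    "rusage": "N/A",
}

for message in find_stats_problems(sample):
    print(message)
```

Running this kind of check against the real worker stats output could show which of the six fields a given worker configuration leaves unset.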

captain828, Nov 10 '23 09:11

I'm relatively new to Celery and was getting similar errors while trying to get Celery Insights to work.

To troubleshoot my setup I implemented the same CELERY_* settings in my Django settings.py file and updated the compose command for my Celery service. I managed to get it working by disabling the setting CELERY_WORKER_POOL='threads', adding the setting CELERY_BROKER_LOGIN_METHOD="NONE", and updating the Redis transport URLs to redis://default:[email protected]:6379

Adding CELERY_BROKER_LOGIN_METHOD="NONE" resolved the error(s):

2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> broker -> login_method
2023-11-10 11:25:20   none is not an allowed value (type=type_error.none.not_allowed)

Removing CELERY_WORKER_POOL='threads' resolved the error(s):

2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> max-tasks-per-child
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> processes
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> pool -> timeouts
2023-11-10 11:25:20   field required (type=value_error.missing)
2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> rusage
2023-11-10 11:25:20   value is not a valid dict (type=type_error.dict)

Updating the transport URL to use the default user resolved the error(s):

2023-11-10 11:25:20 response -> default@CAPTAIN-PC -> broker -> userid
2023-11-10 11:25:20   none is not an allowed value (type=type_error.none.not_allowed)

While I'm not sure exactly what setting CELERY_WORKER_POOL to threads does, based on the above I presume the Celery Insights connection wants those related settings to exist, though I was unable to find them.
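
For reference, the three changes described above could look roughly like this in settings.py. This is only a sketch: build_redis_url is a hypothetical helper, the host and password are placeholders, and it assumes the Redis transport decodes percent-encoded credentials in the usual URL fashion.

```python
from urllib.parse import quote


def build_redis_url(host: str, port: int = 6379,
                    username: str = "default", password: str = "") -> str:
    """Build a redis:// URL that always carries an explicit username."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot be mistaken for URL separators.
    credentials = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return f"redis://{credentials}@{host}:{port}"


# Placeholder values; substitute your own host and password.
CELERY_BROKER_URL = build_redis_url("host.docker.internal", password="example-password")
CELERY_RESULT_BACKEND = CELERY_BROKER_URL
CELERY_BROKER_LOGIN_METHOD = "NONE"  # the value reported to clear the login_method error
# CELERY_WORKER_POOL = 'threads'     # disabled, as in the working setup described above
```

The explicit "default" username is what addresses the userid error; the other two lines mirror the login_method and pool changes.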

Hope this helps

~Regards Amorik


Props to @JTX for posting the resolution here:

  • https://stackoverflow.com/questions/46569432/does-redis-use-a-username-for-authentication/78236235#78236235

Amorik, Apr 16 '24 01:04

Hey @captain828 and @Amorik, thank you for your feedback! Some of the issues mentioned above were fixed in version v0.2.0.

To upgrade, run docker pull with the latest tag or, if you prefer to pin it, the 0.2.0 version:

docker pull ghcr.io/danyi1212/celery-insights:latest

Could you please upgrade to the latest version and let us know if those errors still occur?
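
One quick way to re-test after upgrading is to request the same endpoint that failed in the original trace and look at the status code. The sketch below assumes the container is still published on port 8555 as in the reproduction steps; the helper names are hypothetical, and the path and query parameters are copied from the logged request.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def stats_url(base: str, worker: str, timeout: int = 10) -> str:
    """Build the stats URL with the same query parameters as the logged request."""
    query = urlencode({"timeout": timeout, "worker": worker})
    return f"{base.rstrip('/')}/api/workers/stats?{query}"


def check_stats(base: str, worker: str) -> int:
    """Return the HTTP status code of the stats endpoint.

    A 500 means the server still fails on the response; connection failures
    (for example, the container not running) raise URLError and are not caught.
    """
    try:
        with urllib.request.urlopen(stats_url(base, worker), timeout=15) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

For example, check_stats("http://localhost:8555", "default@CAPTAIN-PC") should return 200 once the validation errors are gone.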

danyi1212, Sep 09 '24 23:09