Superset worker restarting and Celery ignoring config values - HELM / Kubernetes / SQS / S3
I am running Superset on Kubernetes (EKS v1.23, HELM chart v0.7.7, Superset Docker image version "2-0"). I am using SQS as my Celery broker and S3 as my results backend and cache. The S3 cache and results backend work, but using SQS as the broker does not work as expected.
How to reproduce the bug
The worker runs this command as its liveness probe:
celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
The probe fails with this error:
Error: No nodes replied within time constraint
I'm not sure why this is happening. I followed these documentation pages and set everything up accordingly:
- Celery Configuration - https://docs.celeryq.dev/en/stable/userguide/configuration.html
- Celery with SQS - https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/sqs.html
- Running Superset on Kubernetes - https://superset.apache.org/docs/installation/running-on-kubernetes
- Async Queries using Celery - https://superset.apache.org/docs/installation/async-queries-celery
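The liveness check described above can also be run from a Python shell inside the worker pod, which makes it easier to try longer timeouts than the probe allows. This is a minimal sketch, not the chart's implementation: the helper names `worker_node_name` and `ping_worker` are hypothetical, and the import path is taken from the `-A superset.tasks.celery_app:app` argument of the probe command.

```python
def worker_node_name(hostname: str, prefix: str = "celery") -> str:
    """Build the node name the probe targets, i.e. celery@<hostname>."""
    return f"{prefix}@{hostname}"


def ping_worker(hostname: str, timeout: float = 5.0):
    """Ping one worker through Celery's remote-control API.

    Returns a dict of replies keyed by node name, or None when no node
    answered within `timeout` seconds.
    """
    # Lazy import (assumption: this runs where Superset is installed),
    # so worker_node_name() stays usable without Superset.
    from superset.tasks.celery_app import app

    inspector = app.control.inspect(
        destination=[worker_node_name(hostname)],
        timeout=timeout,
    )
    return inspector.ping()
```

Calling `ping_worker(os.environ["HOSTNAME"])` in the worker container should mirror what the probe does; a `None` result corresponds to the "No nodes replied" message.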
This is my Celery configuration in the values.yaml file:
configOverrides:
  enable_s3_caching: |
    from s3cache.s3cache import S3Cache
    from datetime import timedelta
    from flask import Flask
    from flask_caching import Cache
    from superset.config import *

    SQLALCHEMY_DATABASE_URI = f"postgresql://{db_username}:{db_password}@{host_name}/superset"

    S3_CACHE_BUCKET = BUCKET_NAME
    SQL_LAB_S3_CACHE_KEY_PREFIX = 'sql-lab-result/'
    CHARTING_DATA_S3_CACHE_KEY_PREFIX = 'chart-query-results/'
    FILTER_STATE_S3_CACHE_KEY_PREFIX = 'filter-state-results/'
    EXPLORE_FORM_S3_CACHE_KEY_PREFIX = 'explore-form-results/'
    THUMBNAIL_S3_CACHE_KEY_PREFIX = 'thumbnails/'

    RESULTS_BACKEND = S3Cache(S3_CACHE_BUCKET, SQL_LAB_S3_CACHE_KEY_PREFIX)

    def init_data_cache(app: Flask, config, cache_args, cache_options) -> S3Cache:
        return S3Cache(S3_CACHE_BUCKET, CHARTING_DATA_S3_CACHE_KEY_PREFIX)

    def init_filter_state_cache(app: Flask, config, cache_args, cache_options) -> S3Cache:
        return S3Cache(S3_CACHE_BUCKET, FILTER_STATE_S3_CACHE_KEY_PREFIX)

    def init_explore_cache(app: Flask, config, cache_args, cache_options) -> S3Cache:
        return S3Cache(S3_CACHE_BUCKET, THUMBNAIL_S3_CACHE_KEY_PREFIX)

    def init_thumbnail_cache(app: Flask, config, cache_args, cache_options) -> S3Cache:
        return S3Cache(S3_CACHE_BUCKET, EXPLORE_FORM_S3_CACHE_KEY_PREFIX)

    THUMBNAIL_CACHE_CONFIG = {'CACHE_TYPE': 'superset_config.init_thumbnail_cache'}
    DATA_CACHE_CONFIG = {'CACHE_TYPE': 'superset_config.init_data_cache'}
    FILTER_STATE_CACHE_CONFIG = {'CACHE_TYPE': 'superset_config.init_filter_state_cache'}
    EXPLORE_FORM_DATA_CACHE_CONFIG = {'CACHE_TYPE': 'superset_config.init_explore_cache'}

    SECRET_KEY = f"SECRET_KEY "
    ENABLE_PROXY_FIX = True
    THUMBNAIL_SELENIUM_USER = SELENIUM_USER
    WEBDRIVER_BASEURL = BASE_URL
    CELERY_ENABLE_REMOTE_CONTROL = False

    class CeleryConfig:
        task_queues = None
        broker_url = "sqs://"
        broker_transport_options = {
            'region': 'eu-central-1',
        }
        imports = ("superset.sql_lab", 'superset.tasks',)
        worker_log_level = "DEBUG"
        worker_prefetch_multiplier = 1
        worker_enable_remote_control = False
        task_default_queue = "celery"
        #task_acks_late = False
        task_annotations = {
            "sql_lab.get_sql_results": {"rate_limit": "100/s"},
        }

    CELERY_CONFIG = CeleryConfig
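To tell whether the values in CeleryConfig actually reach the running worker, the effective settings can be read back from the Celery app. The following is a rough sketch under stated assumptions: `effective_settings` and `load_superset_celery_conf` are hypothetical helper names, and the import path is the one used by the liveness probe command.

```python
def effective_settings(conf, names):
    """Return {name: value} for each setting; None when it is not set."""
    return {name: getattr(conf, name, None) for name in names}


def load_superset_celery_conf():
    """Return the configuration of Superset's Celery app."""
    # Lazy import (assumption: executed inside the worker container).
    from superset.tasks.celery_app import app

    return app.conf


# Settings from the CeleryConfig class that are worth comparing.
SETTINGS_TO_CHECK = [
    "broker_url",
    "worker_enable_remote_control",
    "task_default_queue",
    "worker_prefetch_multiplier",
]
```

Running `effective_settings(load_superset_celery_conf(), SETTINGS_TO_CHECK)` in the worker pod and comparing the result with the class above would show which values, if any, are not being applied.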
Expected results
I expect the worker not to fail the liveness probe, especially because Superset is able to create a queue named "celery" in SQS automatically. I also expect Celery not to create "pid" queues, because I set both worker_enable_remote_control = False and CELERY_ENABLE_REMOTE_CONTROL = False.
Actual results
The worker keeps restarting because of the error "Error: No nodes replied within time constraint", and worker_enable_remote_control = False is completely ignored: I can see multiple queues being created (see screenshot).
The worker restarts multiple times (see screenshot) and every time it creates a lot of pid queues.
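The extra queues can also be listed without the AWS console. This is only a sketch: the helper names are hypothetical, it assumes boto3 credentials with permission to list SQS queues, and it treats every queue other than the "celery" task queue as unexpected rather than guessing the exact naming pattern of the pid queues.

```python
def queue_name(queue_url: str) -> str:
    """Extract the queue name, the last path segment of an SQS queue URL."""
    return queue_url.rstrip("/").rsplit("/", 1)[-1]


def extra_queues(queue_urls, task_queue: str = "celery"):
    """Return the sorted names of all queues other than the task queue."""
    names = (queue_name(url) for url in queue_urls)
    return sorted(name for name in names if name != task_queue)


def list_queue_urls(region: str = "eu-central-1"):
    """Fetch queue URLs from SQS (single call, no pagination)."""
    # Lazy import so the helpers above work without boto3 installed.
    import boto3

    response = boto3.client("sqs", region_name=region).list_queues()
    # "QueueUrls" is absent from the response when no queues exist.
    return response.get("QueueUrls", [])
```

Calling `extra_queues(list_queue_urls())` before and after a worker restart would show how many additional queues each restart leaves behind.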
When the worker starts again, this is the output I get from Celery:
Environment
- superset version: 2.0.1
- python version: 3.8.12
- EKS version: 1.23
- Docker tag: 2-0
- HELM Chart version: 0.7.7
- celery[sqs] (pip) version: 5.2.7
- boto3 (pip) version: 1.26.2
- s3werkzeugcache (pip) version: 0.2.1
I would really appreciate your help on this because I can't seem to find anything online about it.