superset
superset copied to clipboard
fix: Chart cache-warmup task fails on Superset 4.0
SUMMARY
Fixes #28705
We found our chart cache warm-up task (as in the example in the Kubernetes docs) was failing with this error:
[2024-05-24 22:30:00,045: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Loading strategy
[2024-05-24 22:30:00,045: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Loading TopNDashboardsStrategy
[2024-05-24 22:30:00,046: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Success!
HTTPException
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1788, in dispatch_request
self.raise_routing_exception(req)
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1770, in raise_routing_exception
raise request.routing_exception # type: ignore
File "/usr/local/lib/python3.10/site-packages/flask/ctx.py", line 351, in match_request
result = self.url_adapter.match(return_rule=True) # type: ignore
File "/usr/local/lib/python3.10/site-packages/werkzeug/routing/map.py", line 619, in match
raise MethodNotAllowed(valid_methods=list(e.have_match_for)) from None
werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL.
[repeated]
On investigation I found that:
- The
cache-warmuptask callsPUT /superset/warm_up_caches/(via thefetch_urltask) /superset/warm_up_caches/(Superset.warm_up_caches) exists but is a GET request- That endpoint is deprecated by
/api/v1/chart/warm_up_caches(ChartRestApi.warm_up_cache) (scheduled for deletion in 4.0 but I couldn't find discussions around that)
This PR changes the task to call the newer endpoint (accidentally reverted in 56069b05f9cf4d0c725d1b4b0ad6038b50837cd4).
I also ran into CSRF issues (similar to discussion in #27160), so added a method to fetch this (thanks to Will Gan on slack for the hint). Perhaps this can be placed more centrally? This appears to work on installations with CSRF disabled (I've tested with WTF_CSRF_ENABLED = False).
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
- Ensure you have a scheduled job to warmup caches, e.g. by adding to CeleryConfig:
class CeleryConfig: broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}" - imports = ("superset.sql_lab",) + imports = ( + "superset.sql_lab", + "superset.tasks.cache", + ) result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}" worker_prefetch_multiplier = 1 task_acks_late = False @@ -87,6 +90,13 @@ class CeleryConfig: "task": "reports.prune_log", "schedule": crontab(minute=10, hour=0), }, + "cache-warmup-dummy": { + "task": "cache-warmup", + "schedule": crontab(minute="*/1", hour="*"), + "kwargs": { + "strategy_name": "dummy", + }, + }, } - If starting from an empty installation, create a dashboard, with at least one data-backed chart, and publish it
- Each minute you should see log lines like:
as compared to this on master:superset_worker | [2024-05-25 00:18:03,203: INFO/ForkPoolWorker-1] fetch_url[b9a9c5c6-41e0-491f-a8cb-d49c0b9ab396]: Fetching http://superset_app:8088/api/v1/security/csrf_token/ superset_app | 2024-05-25 00:18:03,198:DEBUG:superset.stats_logger:[stats_logger] (timing) ChartRestApi.warm_up_cache.time | 176.14193800545763 superset_app | 2024-05-25 00:18:03,200:INFO:werkzeug:192.168.165.7 - - [25/May/2024 00:18:03] "PUT /api/v1/chart/warm_up_cache HTTP/1.1" 200 -superset_worker | [2024-05-25 00:26:00,162: INFO/MainProcess] Task fetch_url[a488cfb6-1d63-42bb-be00-a3e6af3e2c4a] received superset_worker | [2024-05-25 00:26:00,165: INFO/ForkPoolWorker-1] fetch_url[9a054a67-acf0-4fd8-b0dd-42ab28da886b]: Fetching http://superset_app:8088/superset/warm_up_cache/ with payload {"chart_id": 156} superset_app | 2024-05-25 00:26:00,168:WARNING:superset.views.base:HTTPException superset_app | Traceback (most recent call last): superset_app | File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request superset_app | rv = self.dispatch_request() superset_app | File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1458, in dispatch_request superset_app | self.raise_routing_exception(req) superset_app | File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1440, in raise_routing_exception superset_worker | [2024-05-25 00:26:00,167: INFO/ForkPoolWorker-7] cache-warmup[57b6894a-e9a2-4fba-bf8e-2e7d74826f61]: Scheduling {"chart_id": 156} superset_app | raise request.routing_exception # type: ignore superset_app | File "/usr/local/lib/python3.10/site-packages/flask/ctx.py", line 353, in match_request superset_app | result = self.url_adapter.match(return_rule=True) # type: ignore superset_app | File "/usr/local/lib/python3.10/site-packages/werkzeug/routing/map.py", line 624, in match superset_app | raise MethodNotAllowed(valid_methods=list(e.have_match_for)) from None superset_app | werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL. superset_app | 2024-05-25 00:26:00,170:INFO:werkzeug:192.168.165.8 - - [25/May/2024 00:26:00] "PUT /superset/warm_up_cache/ HTTP/1.1" 405 -
A clearer example on my clean install (based on 4.0.0):
superset_worker_beat | [2024-05-24 23:38:00,065: INFO/MainProcess] Scheduler: Sending due task cache-warmup-hourly (cache-warmup)
superset_worker | [2024-05-24 23:38:00,084: INFO/MainProcess] Task cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40] received
superset_worker | [2024-05-24 23:38:00,093: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Loading strategy
superset_worker | [2024-05-24 23:38:00,094: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Loading DummyStrategy
superset_worker | [2024-05-24 23:38:00,095: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Success!
superset_worker | [2024-05-24 23:38:00,161: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Scheduling {"chart_id": 1}
superset_worker | [2024-05-24 23:38:00,181: INFO/MainProcess] Task fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e] received
superset_worker | [2024-05-24 23:38:00,185: INFO/ForkPoolWorker-7] Task cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40] succeeded in 0.09269619999395218s: {'scheduled': ['{"chart_id": 1}'], 'errors': []}
superset_worker | [2024-05-24 23:38:00,202: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetching http://webserver.superset.local:8088/api/v1/security/csrf_token/
superset_app | 2024-05-24 23:38:00,259:INFO:werkzeug:192.168.214.6 - - [24/May/2024 23:38:00] "GET /api/v1/security/csrf_token/ HTTP/1.1" 200 -
superset_worker | [2024-05-24 23:38:00,261: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetching http://webserver.superset.local:8088/api/v1/chart/warm_up_cache with payload {"chart_id": 1}
superset_app | 2024-05-24 23:38:02,332:INFO:werkzeug:192.168.214.6 - - [24/May/2024 23:38:02] "PUT /api/v1/chart/warm_up_cache HTTP/1.1" 200 -
superset_worker | [2024-05-24 23:38:02,333: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetched http://webserver.superset.local:8088/api/v1/chart/warm_up_cache with payload {"chart_id": 1}, status code: 200
superset_worker | [2024-05-24 23:38:02,335: INFO/ForkPoolWorker-8] Task fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e] succeeded in 2.1522701479989337s: {'success': '{"chart_id": 1}', 'response': '{"result":[{"chart_id":1,"viz_error":null,"viz_status":"success"}]}
ADDITIONAL INFORMATION
- [x] Has associated issue: #28705
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration (follow approval process in SIP-59)
- [ ] Migration is atomic, supports rollback & is backwards-compatible
- [ ] Confirm DB migration upgrade and downgrade tested
- [ ] Runtime estimates and downtime expectations provided
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
@gooroodev please review
Summary of Changes
-
New Functionality:
- Added a new function
fetch_csrf_tokento fetch a CSRF token for API requests. - Integrated the
fetch_csrf_tokenfunction into thefetch_urltask to ensure that CSRF tokens are included in the headers for API requests.
- Added a new function
-
Modifications:
- Updated the
fetch_urlmethod to useChartRestApi.warm_up_cacheinstead ofSuperset.warm_up_cache. - Modified the integration test
test_fetch_urlto include tests for the newfetch_csrf_tokenfunction.
- Updated the
-
Tests:
- Added a new test
test_fetch_csrf_tokento verify the functionality of fetching CSRF tokens. - Updated the
test_fetch_urlto mock and validate the behavior offetch_csrf_token.
- Added a new test
Issues, Bugs, or Typos
-
Error Handling in
fetch_csrf_token:- The function
fetch_csrf_tokenlogs an error but does not raise an exception or handle the error case effectively. This can lead to silent failures.
- The function
-
Variable Naming:
- The variable
session_cookieshould be checked forNonebefore being added to the headers to avoid potential issues if the cookie is not set.
- The variable
-
Test Improvements:
- The test
test_fetch_csrf_tokenshould include a case where the response code is not 200 to ensure error handling is tested.
- The test
Proposed Code Improvements
Improvement 1: Error Handling in fetch_csrf_token
Current Code:
logger.error("Error fetching CSRF token, status code: %s", response.code)
return {}
Improved Code:
logger.error("Error fetching CSRF token, status code: %s", response.code)
raise Exception(f"Failed to fetch CSRF token, status code: {response.code}")
Improvement 2: Check for None in session_cookie
Current Code:
session_cookie = response.headers.get("Set-Cookie")
if response.code == 200:
data = json.loads(body)
return {"Cookie": session_cookie, "X-CSRF-Token": data["result"]}
Improved Code:
session_cookie = response.headers.get("Set-Cookie")
if response.code == 200 and session_cookie:
data = json.loads(body)
return {"Cookie": session_cookie, "X-CSRF-Token": data["result"]}
else:
logger.error("Error fetching CSRF token or session cookie missing, status code: %s", response.code)
raise Exception(f"Failed to fetch CSRF token or session cookie missing, status code: {response.code}")
Improvement 3: Test for Non-200 Response Code
Current Test:
def test_fetch_csrf_token(mock_urlopen, mock_request_cls, base_url, app_context):
# ... existing test code ...
Improved Test:
def test_fetch_csrf_token(mock_urlopen, mock_request_cls, base_url, app_context):
from superset.tasks.cache import fetch_csrf_token
mock_request = mock.MagicMock()
mock_request_cls.return_value = mock_request
mock_response = mock.MagicMock()
mock_urlopen.return_value.__enter__.return_value = mock_response
# Test for successful response
mock_response.code = 200
mock_response.read.return_value = b'{"result": "csrf_token"}'
mock_response.headers.get.return_value = "new_session_cookie"
app.config["WEBDRIVER_BASEURL"] = base_url
headers = {"Cookie": "original_session_cookie"}
result_headers = fetch_csrf_token(headers)
mock_request_cls.assert_called_with(
"http://base-url/api/v1/security/csrf_token/",
headers=headers,
method="GET",
)
assert result_headers["X-CSRF-Token"] == "csrf_token"
assert result_headers["Cookie"] == "new_session_cookie"
mock_urlopen.assert_called_once_with(mock_request, timeout=mock.ANY)
# Test for unsuccessful response
mock_response.code = 500
with pytest.raises(Exception):
fetch_csrf_token(headers)
General Review of Code Quality and Style
- Code Quality: The code is generally well-structured and follows good practices. The addition of the
fetch_csrf_tokenfunction abstracts the CSRF token fetching logic, making thefetch_urlfunction cleaner and more maintainable. - Code Style: The code follows PEP 8 guidelines, and the use of type hints improves readability and maintainability.
- Logging: The use of logging is appropriate for debugging and tracking the flow of the program, but error handling could be more robust.
- Testing: The tests cover the new functionality well, but additional edge cases and error conditions should be tested to ensure robustness.
Yours, Gooroo.dev. To receive reviews automatically, install Github App
Codecov Report
Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.
Project coverage is 83.65%. Comparing base (
76d897e) to head (620502d). Report is 1701 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| superset/tasks/utils.py | 93.75% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #28706 +/- ##
===========================================
+ Coverage 60.48% 83.65% +23.16%
===========================================
Files 1931 528 -1403
Lines 76236 38181 -38055
Branches 8568 0 -8568
===========================================
- Hits 46114 31941 -14173
+ Misses 28017 6240 -21777
+ Partials 2105 0 -2105
| Flag | Coverage Δ | |
|---|---|---|
| hive | 48.93% <31.42%> (-0.22%) |
:arrow_down: |
| javascript | ? |
|
| mysql | 76.66% <94.28%> (?) |
|
| postgres | 76.75% <94.28%> (?) |
|
| presto | 53.48% <31.42%> (-0.32%) |
:arrow_down: |
| python | 83.65% <94.28%> (+20.15%) |
:arrow_up: |
| sqlite | 76.24% <94.28%> (?) |
|
| unit | 60.28% <28.57%> (+2.65%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Is there anything I can do to help get this merged?
I'm also interested in working cache warmups
- PEP
Feel free to give it a code review, and/or answer any of the open questions on the thread.
Still hoping to get an opinion from @betodealmeida and/or @dpgaspar , but I'll add @john-bodley to the mix as well.
I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!
Thanks for providing solution, however this wouldn't work if someone is using superset-websockets as in the headers there are two "Set-Cookie"'s first one is an async-token and second session token, your code will get the first one which is wrong and you'll get 404 UNATHORIZED
I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!
I installed the version 4.0.2, and applied the fix suggested by @rmasters in the cache.py file. And now it's working.
Before the fix, I was getting the 405 error.
Would love to close the related issue, but I'm not sure of the risks involved here (particularly with websockets added to the mix). Maybe @villebro or @nytai would have a look here and see if this seems save to merge.
@rmasters this needs a rebase, and I don't have permissions to push changes to your branch to do it. Can you rebase, and optionally make that utils refactor?
@cmartinezdnn Have you tested that condition??
Looks like some CI checks aren't running for whatever reason. Might be in need of the aforementioned rebase, but I'll close and reopen this to see if that kick-starts it.
You think that this could be merged soon? What I need to change in the code to fix that issue manually?
I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!
I installed the version 4.0.2, and applied the fix suggested by @rmasters in the cache.py file. And now it's working.
Before the fix, I was getting the 405 error.
Hi, can you please deliberate on what did you change so it will work for you? I'm encountering the same issue
Yeah... any traction on this would be amazing
Apologies for the radio silence; I don't seem to be getting github notifications through to my email...
Many thanks for the reviews. I'll rebase and pop that refactor to utils in shortly.
@talfour Thanks for providing solution, however this wouldn't work if someone is using superset-websockets as in the headers there are two "Set-Cookie"'s first one is an async-token and second session token, your code will get the first one which is wrong and you'll get 404 UNATHORIZED
I'm not familiar with the websocket server, but am I right in understanding that it should not pick up the async/websocket-server cookie, or does it proxy requests at all? It doesn't appear that the FAB session cookie name can be changed from session so I can filter for that (it has been years since I used FAB though so if anyone knows different please lmk).
Edit: I've popped a change in that parses out all the set-cookie headers and picks the one named session - hopefully this does the trick
Hey @villebro, I've rebased onto master, split out the csrf-token fetcher and made a fix towards the session cookie grabber, if you can trigger the CI for me please 🚀
@rmasters thanks for the rebase. Any changes we need to add in terms of ingress annotation for the chart?
I've started CI and will try to review this tomorrow (PST)
Waiting for flaky tests to rerun, will try to merge as soon as CI passes.
One last flaky test to go.. 🤦
Hi in which version this PR is going to be released?