refactor: Allow superset to be deployed under a prefixed URL
SUMMARY
Introduces changes in the front and backend to allow Superset to be deployed under a prefixed-URL path rather than the root of the domain. A new environment variable BASE_PATH has been introduced into webpack, alongside the existing ASSET_BASE_URL, to define the prefix.
The frontend has been updated to include the prefix in any resource links along with two new helper utilities, assetUrl & pathUtils, to help construct the paths. The SupersetClient has been update to remove the unused baseUrl field and rename it as basePath such that the object can be initialized with the base path and calling any endpoints through this class can use the same relative link as before. QUESTION: Have I broken an API here?
The backend has been refactored to avoid using hardcoded paths in redirects and instead uses Flask.url_for to construct reference. This requires that the proxy server set the X-Forwarded-Prefix header along with setting ENABLE_PROXY_FIX=True in superset_config.py
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION
- [ ] Has associated issue:
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration (follow approval process in SIP-59)
- [ ] Migration is atomic, supports rollback & is backwards-compatible
- [ ] Confirm DB migration upgrade and downgrade tested
- [ ] Runtime estimates and downtime expectations provided
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Yes agreed I definitely need to add some documentation for how to use this. I wanted to wait until I was more confident that the general approach was acceptable before working on describing how to use it. In terms of a review as part of #26319 is there anything I need to do at the moment?
I think this may be quite an undertaking and might have to be tackled in phases. Would love to see this done :)
I remember giving it a shot early on in the project at least once, and never wrapped up the work. My main advice would be to avoid taking in side mission or bigger refactor while tackling this.
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Yes agreed I definitely need to add some documentation for how to use this. I wanted to wait until I was more confident that the general approach was acceptable before working on describing how to use it. In terms of a review as part of #26319 is there anything I need to do at the moment?
Thanks for these changes. Can you please mention values for BASE_PATH and ASSET_BASE_URL. Just wanted to check these changes with superset and nginx.
For what it's worth, we've been running this fork to serve superset on a path prefix in a Kubernetes(K3S) + traefik context without any problems so far. Though testing has been limited.
Thanks for the work so far
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Yes agreed I definitely need to add some documentation for how to use this. I wanted to wait until I was more confident that the general approach was acceptable before working on describing how to use it. In terms of a review as part of #26319 is there anything I need to do at the moment?
Thanks for these changes. Can you please mention values for BASE_PATH and ASSET_BASE_URL. Just wanted to check these changes with superset and nginx.
Sorry for the delay in replying to this. I've had zero time to spend on this lately.
The latest commits rebase on the current master and I have also added some documentation to the configuring superset pages that describe the new variables with an example nginx.conf. Does that help?
@rusackas I think this is ready for review but in terms of it being related to SIP-112 is that the right thing to do in terms of the process?
If you think it's ready for review, I'd say you can go ahead and move it from Draft to "Ready for Review" - though it looks like CI might need a little more love still.
As for the SIP, since @sfirke reopened the GitHub issue, I set the SIP itself back to "pre-discussion" state for its official consensus. Did you want to re-kindle that pre-existing SIP, or would it be easier to open a new one? I'm happy to help you carry that SIP through to fruition if you'd like. Feel free to DM me on slack if it's easier.
When clicking on the user list or role list, and then clicking on datasets, a 404 error appears.
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Yes agreed I definitely need to add some documentation for how to use this. I wanted to wait until I was more confident that the general approach was acceptable before working on describing how to use it. In terms of a review as part of #26319 is there anything I need to do at the moment?
Thanks for these changes. Can you please mention values for BASE_PATH and ASSET_BASE_URL. Just wanted to check these changes with superset and nginx.
Sorry for the delay in replying to this. I've had zero time to spend on this lately.
The latest commits rebase on the current master and I have also added some documentation to the configuring superset pages that describe the new variables with an example
nginx.conf. Does that help?
Thanks a lot for the response.
We tried the changes as recommended and observed that most of the things are working when we specify prefix. However there are few things where prefix is not applied correctly as shown in the image below (chart menu has prefix applied correctly)
Also dashboard menu items export to PDF/image not working (Dashboard -> Download -> Export to PDF/Download as Image)
Thanks for tackling this... it's a popular request, no doubt about it. If you think it's relevant, we might also want to add an entry in the documentation somewhere.
Yes agreed I definitely need to add some documentation for how to use this. I wanted to wait until I was more confident that the general approach was acceptable before working on describing how to use it. In terms of a review as part of #26319 is there anything I need to do at the moment?
Thanks for these changes. Can you please mention values for BASE_PATH and ASSET_BASE_URL. Just wanted to check these changes with superset and nginx.
Sorry for the delay in replying to this. I've had zero time to spend on this lately. The latest commits rebase on the current master and I have also added some documentation to the configuring superset pages that describe the new variables with an example
nginx.conf. Does that help?Thanks a lot for the response. We tried the changes as recommended and observed that most of the things are working when we specify prefix. However there are few things where prefix is not applied correctly as shown in the image below (chart menu has prefix applied correctly)
Also dashboard menu items export to PDF/image not working (Dashboard -> Download -> Export to PDF/Download as Image)
I tried to fix above issue of missing prefixes in top-level menu. Please check this PR 2
The solution really works. Solved a long term issue in our deployment behind proxies. Hoping to get merged soon with official code base
Very good, looking forward to merging the code soon
For whom it may concern, I'm testing with a port of this work to 3.1.1rc1 available from watercraft:superset branch prefix
Hi, Good job there, thank you very much for tackling this. I have just discovered that Superset did not support that and was feeling disappointed.
I can see this PR is still flagged as DRAFT. What's left to do to get it "Ready for Review" ?
@martyngigg do you need help in completing this PR ? I'm quite a superset newbie, but I'm willing to provide some help if necessary to get this merged.
Dear all,
Thank you so much for taking the time to test out these changes and I apologise (again!) for the lack of response to any comments. @ravikantkml Thanks for the PR fixing the menu links - I think that looks okay and I'll merge it.
I think the remaining issues are fixing the issues with screenshot generation. I had a brief look at this today and I'm having trouble getting them working even without any prefixed path set - I must have an issue in my development setup.
Today is my last day of work before Christmas break but I will have time when I return to pick this up properly. I'll aim to get it to "Ready for Review" as soon as possible after the holidays. @jeanpommier Thanks for the offer of help too - If you wanted to try and tackle the screenshot generation problem then that would be super helpful but equally I'm happy to look at it after the break.
Happy Holidays! 🎅🏻
testing the PR here outside of the docker context (eg pip install from that github branch branch, rebuild the assets with BASE_PATH set), that work is much welcome.
some notes so far.. i think it should be mentioned that APP_ICON and FAVICON needs BASE_PATH prepended, otherwise it generates 404's here with the default value from config.py
looking a bit more at the code/PR, all assets below STATIC_ASSETS_PREFIX also need to default to BASE_PATH ? that fixes the FAVICON while here.
with this config the superset homepage loads fine with all its assets/images/favicon:
BASE_PATH = "/dashboards"
APP_ICON = BASE_PATH + "/static/assets/images/superset-logo-horiz.png"
STATIC_ASSETS_PREFIX = BASE_PATH
edit oh, i was pretty sure i saw those bits somewhere.. they are in the docker part of the PR in https://github.com/apache/superset/pull/30134/files#diff-415de08cbf9e8633d6cf2c439df7aa08bf9ebce684833ee837d6ce7d7ab7d53b. should those bits be documented, or copied over to the default superset config.py ?
Happy New Year!
I've investigated further and found the issue with "Export to PDF" & "Download image" not working on dashboards. The issue seems to be the interaction of the headless browser and WEBDRIVER_BASEURL. In the docker compose setup superset_config.py suggests setting WEBDRIVER_BASEURL=http://superset_app:8088/, so accessing the superset instance directly and skipping the nginx reverse proxy. This is fine when there is no url prefix set as the paths that both the reverse proxy and the superset app see are the same. When there is a url prefix set it is baked into the javascript files that webpack builds and when accessing the superset app directly the extra prefix part of the path raises an error in the superset app and the screenshot generation fails.
A fix/workaround (??) seems to be to set WEBDRIVER_BASEURL = f"http://superset_nginx{base_path}/" but this also requires removing this with statement that seems to set a test context that doesn't know about the prefix. I'm not sure of the knock on effects of this. @rusackas Would you know someone who could comment on this or possibly suggest an alternative approach?
Hello ! Happy new year !
This is fine when there is no url prefix set as the paths that both the reverse proxy and the superset app see are the same. When there is a url prefix set it is baked into the javascript files that webpack builds and when accessing the superset app directly the extra prefix part of the path raises an error in the superset app and the screenshot generation fails.
Well, I might have a solution for this one. I didn't get the time to look at the image generation task you proposed, but I was rather investigating around the path prefix topic. My concern was to avoid relying on a reverse-proxy for the path-prefix thing. IMO, it makes things complicated. For instance the issue you met above: dev setup (without docker compo) and prod-like setup with a reverse proxy works differently, with a different frontend where the path is hard-coded. Looking on the backend side of it, my first idea was that we maybe didn't use the Flask blueprints to their full extent, but it turns out this would not be that easy to adapt to our needs. And maybe not a clean solution.
But, I stumbled across this blog post where I discovered that WSGI protocol supports a parameter that sets such a path prefix, and that is handled on the WSGI side of it (e.g. gunicorn). I tested it (on top of your PR since we need the url_for links) and it works like a charm.
What does it change ?
Supposing you choose analytics as the path prefix, the backend is accessed on http://superset_app:8088/analytics/. and the reverse-proxy has a much simpler job of just routing /analytics to http://superset_app:8088/analytics/.
On your above-mentioned issue, this means that the paths will concord between the front and back end, so it should work without even needing the WEBDRIVER_BASEURL fix you propose.
I'm still quite uncomfortable at the necessity to build the frontend to support a path other than root. I haven't made experiments yet on this, but wouldn't it be conceivable that the frontend looks for a config instruction from the backend to get this information ?
Hello ! Happy new year !
This is fine when there is no url prefix set as the paths that both the reverse proxy and the superset app see are the same. When there is a url prefix set it is baked into the javascript files that webpack builds and when accessing the superset app directly the extra prefix part of the path raises an error in the superset app and the screenshot generation fails.
Well, I might have a solution for this one. I didn't get the time to look at the image generation task you proposed, but I was rather investigating around the path prefix topic. My concern was to avoid relying on a reverse-proxy for the path-prefix thing. IMO, it makes things complicated. For instance the issue you met above: dev setup (without docker compo) and prod-like setup with a reverse proxy works differently, with a different frontend where the path is hard-coded. Looking on the backend side of it, my first idea was that we maybe didn't use the Flask blueprints to their full extent, but it turns out this would not be that easy to adapt to our needs. And maybe not a clean solution.
But, I stumbled across this blog post where I discovered that WSGI protocol supports a parameter that sets such a path prefix, and that is handled on the WSGI side of it (e.g. gunicorn). I tested it (on top of your PR since we need the url_for links) and it works like a charm.
What does it change ?
Supposing you choose
analyticsas the path prefix, the backend is accessed on http://superset_app:8088/analytics/. and the reverse-proxy has a much simpler job of just routing /analytics to http://superset_app:8088/analytics/. On your above-mentioned issue, this means that the paths will concord between the front and back end, so it should work without even needing the WEBDRIVER_BASEURL fix you propose.
@jeanpommier Thanks for exploring and trying out the idea with having the backend also run on the prefix. I remember trying something like this a while ago and for some reason it didn't work properl. I think I may have been a bit hasty in dropping the idea and it was likely my lack of understanding of Superset that was the problem. I certainly agree relying on the proxy for the prefix handling is complicated. I'll try again since it certainly seems to just require an environment variable :)
I'm still quite uncomfortable at the necessity to build the frontend to support a path other than root. I haven't made experiments yet on this, but wouldn't it be conceivable that the frontend looks for a config instruction from the backend to get this information ?
Yes this is also annoying. Now I'm a bit more familiar with the code base I wondered about including it in this https://github.com/apache/superset/blob/a986a61b5ff0c98d4e0704a1c2d7016b1e3ec721/superset/views/base.py#L333 config dictionary that is rendered into the templates? Anyone know if this is a good/bad idea?
Happy New Year! 新年快乐!
I've investigated further and found the issue with "Export to PDF" & "Download image" not working on dashboards. The issue seems to be the interaction of the headless browser and
WEBDRIVER_BASEURL. In the docker compose setupsuperset_config.pysuggests settingWEBDRIVER_BASEURL=http://superset_app:8088/, so accessing the superset instance directly and skipping the nginx reverse proxy. This is fine when there is no url prefix set as the paths that both the reverse proxy and the superset app see are the same. When there is a url prefix set it is baked into the javascript files that webpack builds and when accessing the superset app directly the extra prefix part of the path raises an error in the superset app and the screenshot generation fails.我已经进一步调查并发现“导出为 PDF”和“下载图片”在仪表板上无法工作的问题。问题似乎是 headless 浏览器和WEBDRIVER_BASEURL之间的交互。在 docker compose 设置中,superset_config.py建议设置WEBDRIVER_BASEURL=http://superset_app:8088/,因此直接访问 superset 实例并跳过 nginx 反向代理。如果没有设置 URL 前缀,这是可以的,因为反向代理和 superset 应用看到的路径是相同的。当设置了 URL 前缀时,它会被烘焙到 webpack 构建的 javascript 文件中,并且当直接访问 superset 应用时,路径的额外前缀部分会在 superset 应用中引发错误,截图生成失败。A fix/workaround (??) seems to be to set
WEBDRIVER_BASEURL = f"http://superset_nginx{base_path}/"but this also requires removing this with statement that seems to set a test context that doesn't know about the prefix. I'm not sure of the knock on effects of this. @rusackas Would you know someone who could comment on this or possibly suggest an alternative approach?一个修复/解决方案似乎是设置WEBDRIVER_BASEURL = f"http://superset_nginx{base_path}/",但这也需要删除这个 with 语句,该语句似乎设置了一个不知道前缀的测试上下文。我不确定这会有什么副作用。您认识可以对此发表评论或可能提出替代方法的人吗?
@martyngigg try APPLICATION_ROOT, by setting it to a specific perfix, i found it works
I'll try again since it certainly seems to just require an environment variable :) Hi @martyngigg
The trick is, you still need to set the BASE_PATH env var for the frontend (needs for now to be built with the prefix), but not for the superset backend, since this is the WSGI server that handles the prefix.
Here is a patch that should work, to run it with the gunicorn server. I've recently rebased your PR against master, so you might get a bit of noise in the docker-compo: gunicorn_script_name.patch.gz
And a similar patch to run the same principle with the flask dev server. I still haven't looked at a common way to get this to work, but it shouldn't be too complicated (I guess there is some flag somewhere indicating if we run the dev server): flask_devserver_script_name.patch.gz
Yes this is also annoying. Now I'm a bit more familiar with the code base I wondered about including it in superset/views/base.py#L333
bootstrap_dataconfig dictionary that is rendered into the templates? Anyone know if this is a good/bad idea?
Hi @martyngigg from me point of view, it looks like the perfect place, but I'm no Superset developer.
I've been playing around quite a bit on this topic, but am still struggling about dynamic configuration of the path prefix for the frontend (i.e. without dedicated build). Did you make progress on your side ?
Yes this is also annoying. Now I'm a bit more familiar with the code base I wondered about including it in superset/views/base.py#L333
bootstrap_dataconfig dictionary that is rendered into the templates? Anyone know if this is a good/bad idea?Hi @martyngigg from me point of view, it looks like the perfect place, but I'm no Superset developer.
I've been playing around quite a bit on this topic, but am still struggling about dynamic configuration of the path prefix for the frontend (i.e. without dedicated build). Did you make progress on your side ?
Hi @jeanpommier,
I've managed to make a little progress with this. Firstly I've improved the situation with the backend such that its no longer the reverse-proxy that strips the prefix. I've added a middleware on the Flask app that does it so now the prefix is understood by the application so it works with the Flask dev server, gunicorn and Nginx. Changeset
Secondly, I've got a work in progress commit that attempts to push the application_root through the bootstrap data now that the app knows about it. It mostly works with the exception of issues with stylesheet imports through webpack that generate the wrong path. I need to explore further but the webpack docs suggest __webpack_public_path__ might be a way to go.
Korbit doesn't automatically review large (500+ lines changed) pull requests such as this one. If you want me to review anyway, use /korbit-review.
Not a review, but at first look, it looks really neat. And quite some work done here !
Loosely related -> https://github.com/apache/superset/pull/31996
Hi @martyngigg I believe you might have missed some places in the frontend: This is kind of a hedge case, but if you edit a dashboard, go to "save as", select overwrite, it is redirecting to /superset/dashboard/yourdashslug after you save a dashboard edit (no prefix anymore)
I believe it might be due to https://github.com/martyngigg/superset/blob/allow_prefixed_paths/superset-frontend/src/dashboard/actions/dashboardState.js#L410 messing up the browser history also, at some point.
Reproduce it:
- edit a dashboard
- save as
- overwrite
- you get redirected to the prefix-less path
Hi @martyngigg I believe you might have missed some places in the frontend: This is kind of a hedge case, but if you edit a dashboard, go to "save as", select overwrite, it is redirecting to /superset/dashboard/yourdashslug after you save a dashboard edit (no prefix anymore)
I believe it might be due to https://github.com/martyngigg/superset/blob/allow_prefixed_paths/superset-frontend/src/dashboard/actions/dashboardState.js#L410 messing up the browser history also, at some point.
Reproduce it:
* edit a dashboard * save as * overwrite * you get redirected to the prefix-less path
Thanks! I think I've fixed this now.
Hello All,
I am trying to run the superset with these changes in my local with docker-compose.yml and ran into some problems. It would be helpful if you have any insights on this.
I tried with both SUPERSET_APP_ROOT with both default "/" and a non-root path "/analytics/reports".
But i get below login screen and welcome screen which is not real screen.
Should i set any other configs to get this working? Thanks in advance for your help
Update: Hi I got it working by using another docker-compose-non-dev.yml .
