Superset app loads infinitely or throws unexpected error occasionally. Ver. 3.1.2
Bug description
This is the 3.1.2 iteration of this issue. While using the Superset app, I sometimes face two issues which might be related:
- The superset app is stuck on the loading screen until refreshed manually.
- When a dashboard is opened it gives an unexpected error.
How to reproduce the bug
Host the latest version of the Superset app, deployed with Docker on an EC2 instance. Connect Superset to a database and create multiple dashboards, then use it continuously for a while. In our case the issue can appear after a redeployment.
Screenshots/recordings
SQL Lab example:
Dashboards tab example:
Superset version
3.1.2
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
It's not able to load a chunk of the JavaScript built by webpack. I'm not sure how you're deploying, but is `npm run build` part of the process if you're making any frontend changes? That compiles all the JS/TS into bundles served by Superset as static assets. I suspect your deployment is somehow only getting part of the JS updated.
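For context on where a "Loading chunk N failed" error comes from, here is a minimal, purely illustrative TypeScript sketch (the module path and hashes are made up, not Superset code): webpack splits dynamic imports into content-hashed chunk files, so a tab loaded before a redeploy can request a filename that no longer exists.

```ts
// Illustrative only — the module path and hashes are made up, not Superset code.
// webpack emits this dynamic import as its own content-hashed chunk,
// e.g. 1234.abc123.chunk.js.
const loadExplorePanel = () => import('./ExplorePanel');

// If a rebuild replaces abc123 with a new hash, a browser tab opened before the
// deploy still requests the old filename, gets a 404, and webpack rejects with
// an error like "Loading chunk 1234 failed".
loadExplorePanel().catch(err => {
  console.error('Stale chunk reference:', err);
});
```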
Yep, @rusackas, `npm run build` is there. I've mentioned deployment as one of the factors, but these chunk errors occur without a deploy as well.
This is pointing towards the backend and frontend being out of sync. The backend is made aware of a manifest at `/static/assets/manifest.json`, and there are many reasons why this could get out of sync (though it's not common AFAIK). But normally, if you're on a clean `npm run build` where the manifest is defined and published, and a fresh backend that read the latest manifest, you should be fine.
I'm unclear what happens if the client (browser) is out of sync, as it may ask for an old/expired asset while hot-loading (?). I'm guessing at Preset we may persist the older build chunks on the CDN or have some other way to tell the client to refresh its manifest.
In your case, does a client force-refresh fix the issue? Could you have zombie pods serving an older version of the app?
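To make the manifest mechanism above easier to picture, a rough, hypothetical sketch follows; the manifest shape and helper are illustrative assumptions, not Superset's actual schema or code. The build publishes a manifest mapping logical entry names to content-hashed filenames, and the backend renders script tags from whatever manifest it last read.

```ts
// Hypothetical shape and helper — illustrative only, not Superset's real code.
interface AssetManifest {
  entrypoints: Record<string, { js?: string[]; css?: string[] }>;
}

// The backend-side idea: resolve an entry name to the hashed files to serve.
function scriptsFor(manifest: AssetManifest, entry: string): string[] {
  const files = manifest.entrypoints[entry]?.js ?? [];
  // If a stale process (or zombie pod) is still holding an older manifest,
  // these filenames can point at chunks that no longer exist on disk or the CDN.
  return files.map(f => `/static/assets/${f}`);
}
```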
Hey @mistercrunch and @rusackas. I think I've found the solution for my case.
I've added this code snippet to the `superset-frontend/src/setup/setupApp.ts` file:
```ts
window.addEventListener('error', e => {
  // the chunk failed to load (likely a stale build); reload to pick up new assets
  if (/Loading chunk [\d]+ failed/.test(e.message)) {
    window.location.reload();
  }
});
```
I can open a PR for this myself if needed.
Looks like a band-aid. It makes me wonder how Superset manages a new manifest and hot loading of obsolete assets (?) If we don't do this already it seems we should generally look for a manifest version and force a refresh of the manifest/assets when the client is behind. I'm wondering if there's anything like that in the code base (?)
@Sabutobi do you only see this on new deploys?
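One way to picture the "client is behind" check floated above, as a hedged sketch: only the `/static/assets/manifest.json` path comes from the thread; the no-store fetch and plain string comparison are simplifications.

```ts
// Hedged sketch: detect a stale client by re-fetching the manifest and
// comparing it with the one captured when the page booted.
let bootManifest: string | null = null;

async function fetchManifestText(): Promise<string> {
  const res = await fetch('/static/assets/manifest.json', { cache: 'no-store' });
  return res.text();
}

export async function clientIsBehind(): Promise<boolean> {
  const current = await fetchManifestText();
  if (bootManifest === null) {
    bootManifest = current; // remember the manifest this tab started with
    return false;
  }
  return current !== bootManifest;
}
```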
If it is the client caching pointers to chunks that no longer exist on the server, it does seem like we might have a few options
- tell the browser to refresh itself on error (the above suggestion) so it becomes aware and loads chunks... but that does seem glitch-prone in its own way, and does feel like a bandaid.
- Keep older chunks around in a directory, with a certain TTL on them that's longer than the client cache. This seems like the safest/smoothest option I can think of.
- Have the browser load from cache or not based on data in the JWT. When the site is redeployed, the JWT could effectively invalidate the cache. Probably not straightforward with a SPA... it'd have to do a check on route change or something.
- Set up a health check endpoint (I think we might already have one, actually) that responds with a build SHA/UUID. We could check that as a polling operation and/or on SPA route change, and force a reload when it changes (sketched below)... but this also feels like a weird bandaid.
This feels like there should be a best practice, and we're reinventing a wheel here. I'll try to do some digging.
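To make that last option concrete, a loose sketch; the `/health` endpoint and `build_sha` field are assumptions for illustration, not a documented Superset API.

```ts
// Assumed endpoint and payload — purely illustrative, not a documented API.
const POLL_MS = 60_000;
let knownBuild: string | null = null;

async function checkBuild(): Promise<void> {
  const res = await fetch('/health', { cache: 'no-store' }); // hypothetical endpoint
  const { build_sha } = await res.json();                    // hypothetical field
  if (knownBuild === null) {
    knownBuild = build_sha;
  } else if (build_sha !== knownBuild) {
    // The backend was redeployed since this tab loaded; pick up the new assets.
    window.location.reload();
  }
}

// Poll in the background and ignore transient network errors.
setInterval(() => checkBuild().catch(() => {}), POLL_MS);
```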
A health check seems like it would be best and could handle a variety of things:
- check that auth hasn't expired and potentially try to refresh the token if that's implemented, otherwise send the user to the login screen
- check the manifest version (frontend/backend synchronicity) and either refresh the manifest client-side, tell the user to refresh the page, or do it on their behalf
Thanks for the fresh ideas. I'll try to implement that somehow...
Are we sure this is a hot-loading issue (where JS bundles are loaded dynamically)? If so, it could just be that we need better error handling on hot-load and that's it. Like, if we get a 404 on hot-load, you could check the manifest version and pop something to the user: "The frontend has gone out of sync with the backend, refresh? Refresh / Cancel"
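As a rough illustration of that error-handling idea (the wrapper, the prompt copy, and the `confirm` flow are assumptions, not an existing Superset hook):

```ts
// Sketch only: catch a failed hot-load and ask the user before reloading.
async function loadWithSyncCheck<T>(importer: () => Promise<T>): Promise<T> {
  try {
    return await importer();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (/Loading chunk .+ failed/.test(message)) {
      const refresh = window.confirm(
        'The frontend has gone out of sync with the backend. Refresh the page?',
      );
      if (refresh) window.location.reload();
    }
    throw err;
  }
}

// usage: loadWithSyncCheck(() => import('./SomeLazyComponent'))
```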
Just chiming in to say that we are getting reports and screenshots of this exact same error (we are on 3.0.2). I have nothing constructive to add here, other than validation that @Sabutobi is not the only one!
Possibly relevant to the questions posed above:
- Yes, we are building our own Docker image that just inherits from the `apache/superset` image to overwrite our own `superset_config.py`; it makes no frontend changes.
- We deploy on Azure App Service, using a burstable instance, which, anecdotally, I feel may occasionally serve up older images on cold start.
Hey y'all. I know that's not the solution, but:
```ts
// Wrap console.error so chunk-load failures trigger a page reload.
const origin = console.error;
console.error = (...args) => {
  const first = args[0];
  const message = first instanceof Error ? first.message : String(first);
  if (/Loading chunk [\d]+ failed/.test(message)) {
    alert('A new version was released. The page needs to reload to apply the changes.');
    window.location.reload();
  } else {
    origin(...args);
  }
};
```
I've additionally added this code snippet to the `superset-frontend/src/setup/setupApp.ts` file, and this is what I see now.
It's better than nothin'! I guess if it doesn't resolve the problem (like the chunk is really not there) then you're stuck in a loop... but that's an edge case to an edge case, and it's kinda busted either way then.
Hi @rusackas, @mistercrunch, will the solution be part of a future release?
Yup, open a PR and we can see what it'd take to push this through. Some things that would be nice:
- having a way to assess more clearly that it's a version/manifest issue; maybe there's another call to compare versions/manifests and get more metadata about it
- an antd-powered modal popup, with relevant information as to what exactly is happening (a rough sketch follows below)
- maybe a more targeted approach hooked to a hot-loading event or callback; `setupApp.ts` may not be the most targeted approach ...
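For the antd-powered modal item above, a loose sketch; the copy, function name, and wiring are assumptions, not taken from an actual Superset PR.

```ts
// Illustrative only — not from an actual Superset PR.
import { Modal } from 'antd';

export function promptForRefresh(detail: string): void {
  Modal.confirm({
    title: 'A newer version of Superset is available',
    content: `The assets this page loaded are out of date. ${detail}`,
    okText: 'Refresh now',
    cancelText: 'Later',
    onOk: () => window.location.reload(),
  });
}
```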
Hey @rusackas and @mistercrunch. FYI: the PR has been opened.
Since upgrading to 4.x, the JS chunk load errors have become so severe that we had to downgrade to the latest 3.x again. (We have several dozen users and use auto-refreshing dashboards a lot.)
> the JS chunk load errors have become so severe

I thought the JS chunk load errors would be correlated to [how many users have an active browser open] * [how often you deploy updates]. Any other theory on why they would be more frequent in your context? A partly defective or overwhelmed CDN/server * the number of chunks to serve?
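For a sense of scale under that theory (numbers purely illustrative, not from any report): 50 users with a dashboard tab open, times 3 deploys a day, gives on the order of 150 opportunities per day for an open tab to request a chunk that no longer exists, before accounting for auto-refreshing dashboards triggering extra hot-loads.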