polis
local polis-file-server sometimes fails to serve files; polis-server crashes
Expected behavior: Polis-file-server always serves static files, or polis-server can handle a single connection drop. Polis-server does not crash.
Actual behavior: Polis-file-server sometimes drops the connection; polis-server crashes
To Reproduce: Deploy both polis-server and polis-file-server in production, running polis-file-server as 'local' upload. Wait several hours, then access a page on polis-server.
Screenshots:

Device information:
- AWS t2.small running docker-compose
Additional context: Logs from polis-server:
part2
{}
{
'x-forwarded-proto': 'https',
host: 'polis.client.newredo.com',
connection: 'close',
'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0',
accept: 'image/webp,*/*',
'accept-language': 'en-GB,en;q=0.7,en-US;q=0.3',
'accept-encoding': 'gzip, deflate, br',
referer: 'https://polis.client.newredo.com/',
cookie: 'REDACTED'
}
/app/node_modules/http-proxy/lib/http-proxy/index.js:120
throw err;
^
Error: socket hang up
at connResetException (internal/errors.js:617:14)
at Socket.socketCloseListener (_http_client.js:443:25)
at Socket.emit (events.js:327:22)
at Socket.EventEmitter.emit (domain.js:486:12)
at TCP.<anonymous> (net.js:673:12)
at TCP.callbackTrampoline (internal/async_hooks.js:129:14) {
code: 'ECONNRESET'
}
After this, polis-server crashes. I have the Docker container set to restart: unless-stopped, so it comes back up afterwards; the next bit of the log is just init 1. The page loads on refresh.
The cause is that routingProxy sometimes emits an 'error' event with code ECONNRESET. Because no 'error' listener is registered on it, e.g.:
routingProxy.on('error', (e) => {
...
})
this error is rethrown (the throw err at http-proxy/lib/http-proxy/index.js:120 in the trace above) and crashes the application.
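A minimal sketch of what such a listener could look like, assuming http-proxy's documented (err, req, res) signature for the 'error' event. The handler name handleProxyError and the 502 reply are illustrative assumptions, not what polis-server currently does, and a plain EventEmitter stands in for routingProxy so the snippet runs without http-proxy installed:

```javascript
const { EventEmitter } = require('events');

// Hypothetical handler; the name and the 502 reply are assumptions for illustration.
function handleProxyError(err, req, res) {
  console.error('proxy error:', err.code || err.message);
  if (res && typeof res.writeHead === 'function' && !res.headersSent) {
    // Nothing has been sent to the client yet, so answer with a gateway error.
    res.writeHead(502, { 'Content-Type': 'text/plain' });
    res.end('Bad gateway');
  } else if (res && typeof res.destroy === 'function') {
    // Headers already sent, or res is a raw socket (WebSocket upgrade): close it.
    res.destroy();
  }
}

// Stand-in for routingProxy (assumption): a bare emitter instead of a real proxy.
const routingProxy = new EventEmitter();
routingProxy.on('error', handleProxyError);

// Simulate the failure from the log above: the upstream socket hangs up.
const resetError = Object.assign(new Error('socket hang up'), { code: 'ECONNRESET' });
const fakeRes = {
  headersSent: false,
  statusCode: null,
  body: null,
  writeHead(status) { this.statusCode = status; this.headersSent = true; },
  end(body) { this.body = body; },
};
routingProxy.emit('error', resetError, {}, fakeRes);
```

With a listener attached, the single failed request gets an error response and the process keeps serving other requests, instead of the error being thrown out of the event loop.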
I think this is https://github.com/http-party/node-http-proxy/issues/1455
This is great sleuthing @midgleyc! Much appreciated! Sorry, there are not many outside people trying to host Polis in production-like environments.
@joshsmith2 are you hitting something like this as well?
@patcon The application does not terminate as such; instead it logs the error, then keeps running with the broken socket and does not respond to further requests. It's unclear why the process does not terminate, but the use of setInterval and long setTimeout calls may be related. Because the process does not terminate, application supervisors (like Docker) don't know to restart it.
I'd like to fix this by making sure the application terminates properly, with an error code, as this will provide a wide solution covering other failure scenarios. Any objections or other ideas?
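As a rough sketch of that idea (not a reviewed patch), the process could log and exit with a non-zero code on any uncaught error, using Node's 'uncaughtException' and 'unhandledRejection' process events. The helper name installFatalHandlers and its injectable proc and log parameters are hypothetical, added only so the behaviour can be exercised without killing a real process:

```javascript
// Hypothetical helper: installFatalHandlers and its parameters are assumptions.
function installFatalHandlers(proc = process, log = console.error) {
  // Build one handler per event kind so the log says what went wrong.
  const die = (kind) => (err) => {
    log(`${kind}:`, err);
    // Exit non-zero so a supervisor (e.g. a Docker restart policy) sees the failure.
    proc.exit(1);
  };
  proc.on('uncaughtException', die('uncaughtException'));
  proc.on('unhandledRejection', die('unhandledRejection'));
}

// Intended use, once at startup:
// installFatalHandlers();
```

Calling process.exit explicitly matters here because pending setInterval or setTimeout timers would otherwise keep the event loop, and therefore the process, alive.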
No objection from me, but to be clear, I don't have merge access directly. I can't imagine the polis team would have an issue, so long as the solution is scoped and as minimal as feasible :) Feel free to talk it through out loud here.
I don't have much to add [other than my appreciation to those who have discussed beforehand], but wanted to confirm that we are hitting this issue on our polis deployment in what appears to be a similar configuration to @midgleyc. Happy to contribute to some possible fixes.