node-postgres icon indicating copy to clipboard operation
node-postgres copied to clipboard

Bug: node postgres forever hangs when the network service is turned off on server.

Open mai1x9 opened this issue 2 years ago • 4 comments

pool when created, hangs forever and never attempts to reconnect to server, when the network service on the server rnning postgres is turned off. On client, create a pool, once the pool is ready connected to postgres, run below command on Server running postgres

nmcli networking off

Now any query executed using pool.query(), throws error: timeout exceeded when trying to connect. However when the server is up again running

nmcli networking on

the pool.query() still timeouts and never reattempts to connect to server. How do we know if really the server is done. From the error timeout exceeded when trying to connect one cannot conclude that server is down, because, if all clients within the pool are busy running the quries, and the queued queries will throw timeout exceeded when trying to connect error if it could not able to acquire the client from the pool. In this case the server is up, but its the pool which is busy and couldn't not acquire client hence the error. The same error is thrown, when the server networking is turned off. The question is how would one precisely know when the server is really down when the clients within the pool is already connected to server?

mai1x9 avatar May 09 '23 11:05 mai1x9

@mai1x9 I also saw similar issues in production during failover events, however I couldn't reproduce locally. Do you have reproduction steps including a program and a docker compose stack?

panga avatar Jan 13 '25 20:01 panga

@mai1x9 faced with the same problem It happened during the postgres restart I'm trying to reproduce the problem, but so far without success

perhaps timeouts can be used as a solution to the problem:

type Config = {
  statement_timeout?: number, // number of milliseconds before a statement in query will time out, default is no timeout
  query_timeout?: number, // number of milliseconds before a query call will timeout, default is no timeout
  lock_timeout?: number, // number of milliseconds a query is allowed to be en lock state before it's cancelled due to lock timeout
}

stas-ut21 avatar Feb 19 '25 14:02 stas-ut21

@panga and @stas-ut21 I haven't fixed the issue, we are fully migrated to sequelize. I however did not check if sequelize has same issue (as it uses same underlying node postgres).

For now I went on belief that the issue won't occur, but it's worrying factor that someone seen in production.

I personally suggest go with sequelize and migration is quite easy. Otherwise help from project maintainer os required and its bad that no one yet addressed.

My hopes are up so went with sequelize.

mai1x9 avatar Feb 19 '25 19:02 mai1x9

the problem occurs due to connection freezes in the pool query may hang if there are dns, proxy, etc. services it is necessary to set all timeouts

@mai1x9 if I'm not mistaken, the sequelize does not use pools from the this module

stas-ut21 avatar Mar 10 '25 13:03 stas-ut21