Receiving `Can't reach database server at...` error periodically

Open alexghattas opened this issue 6 months ago • 9 comments

Bug description

In our production instance, we periodically receive the following error. It happens a handful of times per day, and the Prisma query that fails differs each time. The app is up and running and connects to the DB, and it works as expected the majority of the time, but this error still comes in periodically.

Invalid prisma.answer.findMany() invocation: Can't reach database server at db.url:6543 Please make sure your database server is running at db.url:6543.

How to reproduce

Connect Prisma to a cloud PostgreSQL database using a pooler (pgBouncer)

Expected behavior

No response

Prisma information

This is what our connection string looks like: postgresql://postgres:password@url:6543/postgres?pgbouncer=true&connection_limit=5&pool_timeout=30&sslmode=verify-full&sslcert=certurl.crt

We are using Prisma version: 5.5.2

Environment & setup

  • Database: PostgreSQL version 15.1.0.103
  • Node.js version: v16.20.2

Prisma Version

5.5.2

alexghattas avatar Dec 12 '23 19:12 alexghattas

Hi @alexghattas, do you happen to have some monitoring system set up on your database, to ensure it is effectively always up? Is the database hosted in a Docker container that may restart?

Also, does the same problem occur with [email protected]? Thank you.

jkomyno avatar Dec 14 '23 11:12 jkomyno

@jkomyno thank you for your response!

  • We do have a monitoring system set up (our DB is hosted with Supabase), and I don't see any issues there that line up with when these errors are thrown (we monitor them with Sentry).
  • We have not tested with [email protected]. Since this is only happening in our production instance, we would need to make that change there and see if it resolves the issue. Is that something you would recommend?

Could this also be an issue with the Node.js version on the backend server where Prisma is running?

I believe it is currently running [email protected].

alexghattas avatar Dec 14 '23 18:12 alexghattas

@jkomyno I updated Prisma to 5.7 and Node.js to 20, and unfortunately the issue still persists.

alexghattas avatar Dec 19 '23 02:12 alexghattas

I'm facing the same situation, using Prisma ^5.7.1 through PgBouncer to Postgres on Azure. It always happens when connection_limit is set larger than 100. PgBouncer shows: closing because: client unexpected eof.

danielwii avatar Dec 30 '23 12:12 danielwii

Same issue here on a production setup since upgrading from 5.1.0 to 5.7.1, connecting to Postgres RDS (AWS). The connection string is postgresql://VERY:[email protected]/nogin?connection_limit=5000 and the errors are as follows:

Timed out fetching a new connection from the connection pool. More info: http://pris.ly/d/connection-pool (Current connection pool timeout: 10, connection limit: 5000)
Can't reach database server at `FOO.rds.amazonaws.com`:`5432`
Please make sure your database server is running at `BAR.rds.amazonaws.com`:`5432`.

I am trying adding &pool_timeout=30&connect_timeout=30 to the connection string to see what effect that has, but the timing of the errors definitely lines up with the Prisma upgrade.

I think I agree that it's an issue with Prisma. I was doing a couple of different things when we upgraded Prisma to 5.7, but the issue persists, even after working with the team that provides our pooler.

Have you tried downgrading back to 5.1 to see if it fixes it?

alexghattas avatar Jan 09 '24 18:01 alexghattas

Fixed the issue by reducing connection_limit. Prisma was creating a separate connection pool for each instance, so we had 5000 x 5 potential connections, which exhausted the database limit. Changing the connection string to connection_limit=100 fixed this issue.
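
For anyone checking their own numbers, here is a rough sketch of the arithmetic behind that fix. The helper names are hypothetical, and the database cap of 1000 with 50 reserved connections is an assumed figure for illustration only (the thread does not state the actual RDS limit). Only the 5 instances and the connection_limit values of 5000 and 100 come from the comment above.

```typescript
// Hypothetical helpers, not part of Prisma's API.

/** Worst-case number of connections all app instances could open together. */
function totalPotentialConnections(instanceCount: number, connectionLimit: number): number {
  return instanceCount * connectionLimit;
}

/** Largest per-instance connection_limit that keeps the total within the database cap. */
function maxConnectionLimitPerInstance(
  instanceCount: number,
  databaseMaxConnections: number,
  reservedConnections: number = 0
): number {
  if (instanceCount <= 0) {
    throw new RangeError("instanceCount must be positive");
  }
  const available = databaseMaxConnections - reservedConnections;
  return Math.max(0, Math.floor(available / instanceCount));
}

// Figures from the comment: 5 instances, connection_limit=5000, later 100.
console.log(totalPotentialConnections(5, 5000)); // 25000
console.log(totalPotentialConnections(5, 100)); // 500

// ASSUMED cap: 1000 connections, 50 kept free for admin tools and migrations.
console.log(maxConnectionLimitPerInstance(5, 1000, 50)); // 190
```

The point is only that connection_limit applies per pool, so the number to compare against the database's own limit is the product across all running instances.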

A "fix" for our issue was to add a new query parameter, connect_timeout=300, to the database URL.

Not sure if it's a solution or a band-aid for now, but it prevents the errors from coming through.
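
In case it helps others try the same workaround, this is a minimal sketch of adding that parameter to the connection string from the original report. The setQueryParam helper is hypothetical and uses plain string handling. It is only one way to do it, and editing the URL by hand in the environment variable works just as well.

```typescript
/**
 * Hypothetical helper: set one query parameter on a connection URL,
 * replacing any existing value for the same key.
 */
function setQueryParam(connectionUrl: string, key: string, value: string | number): string {
  const queryStart = connectionUrl.indexOf("?");
  const base = queryStart === -1 ? connectionUrl : connectionUrl.slice(0, queryStart);
  const query = queryStart === -1 ? "" : connectionUrl.slice(queryStart + 1);

  // Keep every existing parameter except the one being set.
  const kept = query
    .split("&")
    .filter((pair) => pair !== "" && pair.split("=")[0] !== key);

  kept.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  return `${base}?${kept.join("&")}`;
}

// Connection string as posted in the issue description.
const originalUrl =
  "postgresql://postgres:password@url:6543/postgres?pgbouncer=true&connection_limit=5&pool_timeout=30&sslmode=verify-full&sslcert=certurl.crt";

const patchedUrl = setQueryParam(originalUrl, "connect_timeout", 300);
console.log(patchedUrl);
```

The existing parameters are left in their original order and connect_timeout=300 is appended at the end. Calling the helper again with the same key replaces the value rather than duplicating it.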

alexghattas avatar Feb 01 '24 21:02 alexghattas

I am facing the same issue. It started when we moved to the ARM architecture for our deployment. The fix of increasing the connection and pool timeouts may work for a short time, but in the long run I am not so sure.

Abhinav0449 avatar Apr 29 '24 10:04 Abhinav0449