healthcheck
healthcheck copied to clipboard
Liveness check - database ping issue
Hello,
I'm trying to make use of this library to implement some healthchecks in my application.
My liveness check consists of a database ping check like this:
health.AddLivenessCheck("database-ping", healthcheck.DatabasePingCheck(db, 2*time.Second))
But given that the database.Ping()
returns an error only on the initial request (i.e. if the database goes down at some point, it will not return an error) the /live
endpoint request returns 200 OK
even if the database is down.
Are there any plans to fix this, perhaps by using database.Query("SELECT 1")
instead of database.Ping()
?
Thanks a lot.
The behavior you're seeing is surprising to me. According to the docs, the Ping()
method is supposed to verify that the connection is still alive. Which database driver are you using? I see some drivers implement Ping()
a little differently:
As a workaround, you could write your own SELECT 1
check pretty concisely:
health.AddReadinessCheck("database-select", func() error {
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
res, err := database.ExecContext(ctx, "SELECT 1")
if err != nil {
return err
}
rows, err := res.RowsAffected()
if err != nil {
return err
}
if rows != 1 {
return fmt.Errorf("incorrect DB result (expected 1 row, got %d)", rows)
}
return nil
})
If this is a widespread problem, I'd be happy to add a check like that into this package.
One other note: you likely want to use a readiness check rather than a liveness check for your upstream database. If the database is unreachable, you don't want to serve requests but you don't necessarily want your pod to be restarted.
@mattmoyer I came across this issue. database.Ping()
doesn't work for me. I use lib/pq.
Maybe the reason is this issue: https://github.com/lib/pq/issues/533 in lib/pq.
Anyway, I use your the code from your comment and it works for me. Thank you!