kine icon indicating copy to clipboard operation
kine copied to clipboard

Skip logging of ErrKeyExists during insert

Open mhmxs opened this issue 5 months ago • 11 comments

This change eliminates logging of duplicate key value violates unique constraint "kine_name_prev_revision_uindex" (SQLSTATE 23505) (in case of postgresql). There are two reasons to do this:

  • ErrKeyExists is a special error, it is not logged when happens on mysql for example.
  • ErrKeyExists suspects client error, and handled by client, so it makes sense to skip at Kine side.

mhmxs avatar May 20 '25 08:05 mhmxs

@yegortokmakov I believe you JUST started logging this in https://github.com/k3s-io/kine/pull/464 - can you provide some commentary on why you found this useful?

brandond avatar May 20 '25 22:05 brandond

hi both and sorry for the slow response! Indeed, I just added it in the last release. We did a bunch of optimizations to Postgres and to Kine (using the new parameters in the same release) and now these and other warning and error logs are gone.

Its true these specific messages are not actionable directly and in most cases won't affect the end performance of the API, but they are definitely a symptom of something wrong in the setup. Most likely, it is postgres / kine performance.

update: In my opinion, we should keep the log but I agree there should be an explanation somewhere what it means and some suggested next steps.

yegortokmakov avatar May 27 '25 08:05 yegortokmakov

they are definitely a symptom of something wrong in the setup

For example, Kubernetes E2E tests generates lots of messages. The tests are handling those failures, so they don't have any purpose, except it is harder to read the logs.

mhmxs avatar May 27 '25 08:05 mhmxs

If you're only seeing errors on e2e tests that specifically do bad things, then I'm less likely to merge a PR that just churns the logging back off again.

brandond avatar May 27 '25 18:05 brandond

If you're only seeing errors on e2e tests that specifically do bad things, then I'm less likely to merge a PR that just churns the logging back off again.

@brandond I'm a developer, so the only workload i run are tests. I don't think the E2E test does anything wrong. It just handle already exists error. What i can say, earlier it didn't log this messages, and now it is annoying to read them, because they are client side issues and not Kine errors :)

mhmxs avatar May 27 '25 19:05 mhmxs

Clearly some folks want them on and some folks want them off. I don't have a strong feeling either way, although I lean towards not turning them off because well... tests are expected to be bit noisy when they're doing bad things, and in production it'd be good to know about problems.

Maybe instead of undoing what was just done, you could add a flag or env var for this?

brandond avatar May 27 '25 20:05 brandond

they are definitely a symptom of something wrong in the setup

For example, Kubernetes E2E tests generates lots of messages. The tests are handling those failures, so they don't have any purpose, except it is harder to read the logs.

What's your setup for the tests? Do you have an example I can try to reproduce? I'm curious what can cause it.

In the end, it might be that the Postgres instance used in the test suite doesn't have enough resources to keep up.

yegortokmakov avatar May 27 '25 23:05 yegortokmakov

Clearly some folks want them on and some folks want them off.

In that case, i suggest to do this logging on other databases too, because in this way it behaves differently.

mhmxs avatar May 28 '25 07:05 mhmxs

Postgres is the only database engine we support that has wonky auto-increment PK semantics. The error simply doesn't happen on other databases.

It basically increments a counter in a transaction and then uses that as the new value. If multiple clients try to insert at the same time, they will both get the same new ID, and one of the transactions will fail due to the unique constraint. On all other engines the PK can be incremented atomically without potentially causing a conflict.

brandond avatar May 28 '25 07:05 brandond

The error simply doesn't happen on other databases

I see, but the error i have seen was related to kine_name_prev_revision_uindex and not PK collision. That unique constraint is existing on mysql too. The log message is wrapped by err != nil, so it logs every error. But if you check dialect.TranslateErr of both mysql and pgsql, it translates unique constraint errors to server.ErrKeyExists, and these errors are not logged in Kine. But this log message doesn't take care on translating of errors, so it logs every error, but only on pgsql.

mhmxs avatar May 28 '25 19:05 mhmxs

I see, but the error i have seen was related to kine_name_prev_revision_uindex and not PK collision.

Yes, that's the unique constraint violation I was referring to.

brandond avatar May 28 '25 21:05 brandond

Closing, as I don't think there was a consensus that this is something we want to turn back off again.

brandond avatar Oct 10 '25 21:10 brandond