John Spray
I'd still like metrics -- searching logs is a relatively fragile way of alerting compared with a counter. I'm not sure how big of an ask this is though: do...
I suggest we get it into the "Cross service endpoint debugging" dashboard before pausing -- let's make sure that next time we're investigating an unhappy database, these metrics are visible...
Looking back 30 days, there was one other case of this, on 2024-12-07: https://neon-github-public-dev.s3.amazonaws.com/reports/main/12213126318/index.html#/testresult/4fed18972c23e9c0
@arpad-m for PRs that will fail a bunch of tests, you might want to borrow this commit: https://github.com/neondatabase/neon/pull/9537/commits/8f943d9c1f6344c078b963df257b7631d38c9623
@ololobus could you please consider the size/priority of this?
(Thanks for the approval - going to merge later to avoid conflicts with other things)
@dimitri looks like this needs a retry on CI for it to merge (click dots next to failure, view details, "Rerun jobs")
> Do we want metrics on semaphore wait queue length or are higher-level metrics sufficient? If it's easy, then a queue depth stat is a nice thing to have in...
In https://github.com/neondatabase/neon/pull/7766 the `id` moves out of config and into the identity file. Since the `identity.toml` is written externally, we still need some file written by the pageserver itself that...
> I can totally foresee us needing to change the control_plane_api url. Yes: this proposal doesn't prevent that, but it requires it to be done in a very explicit way:...