Mikhail

Results: 91 comments by Mikhail

@jonch070 I think I fixed it, have a look

Thanks for the contribution! It seems like we have already made the changes independently, so closing this PR.

@jcsp Given TCP overrides are in prod now, and single endpoint max pageserver wait time is capped at approximately 10 minutes[^1], is this task still active? I've added https://neonprod.grafana.net/goto/XNJ-wE1Ng?orgId=1 visualization...

Ah, so I misunderstood the request: it's not about metrics _visualization_ but rather about _adding_ metrics to, e.g., compute_ctl. OK, this seems reasonable, I'll have a look at how this can be...

The ratio can then be calculated as `getpage_stuck_requests_total / getpage_sync_requests_total`
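A minimal sketch of that calculation, assuming both metrics are monotonically increasing counters sampled at the same moment. The helper name `stuck_ratio` and the sample values are hypothetical and only illustrate the division, including the case where no requests have been counted yet:

```python
def stuck_ratio(stuck_total: int, sync_total: int) -> float:
    """Return the fraction of sync getpage requests that were counted as stuck."""
    if sync_total == 0:
        # No requests observed yet: report 0.0 instead of dividing by zero.
        return 0.0
    return stuck_total / sync_total


# Hypothetical counter samples, not real production values.
samples = {
    "getpage_stuck_requests_total": 3,
    "getpage_sync_requests_total": 1200,
}

ratio = stuck_ratio(
    samples["getpage_stuck_requests_total"],
    samples["getpage_sync_requests_total"],
)
print(f"stuck ratio: {ratio:.4%}")
```

On a dashboard the same ratio would more likely be computed over a time window rather than from lifetime totals, so that old activity does not dilute recent stuck requests.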

This week:
- Plan to merge https://github.com/neondatabase/neon/pull/11710 (waiting for review by @ololobus)

Other chunks of work are blocked till service is merged and deployed, optimistic plan is this week's compute...

@jcsp Given metrics are now on prod, "stuck getpage" grafana dashboard features them, and we have a Communicator project ongoing which will probably reevaluate our metrics of such corner cases,...

Added getpage metrics to the dashboard (link in child ticket) and verified the metrics are correct

Agreed, let's wait till next release

Regarding garbage collector component: 1. I can't think of a solution where garbage collectors run in parallel. If a GC on replica tries processing a log entry range [X; Y),...