Erik Grinaker
`package.json` says MIT; it would be nice to have a `LICENSE` file as well.
In #110, we made some minor changes to snapshot handling, which removed the main use of `Progress.PendingSnapshot`. As described in https://github.com/etcd-io/raft/pull/110#discussion_r1405420442, we should now be able to remove `PendingSnapshot` completely,...
In #79, we added `StepDownOnRemoval`, which makes a leader step down to follower after removing itself from the group or demoting itself to learner. Without this, the old leader may...
This was deprecated in #62, and should be removed in the next major release.
## Problem

We don't have any observability for Safekeeper WAL receiver queues. Resolves #9328.

## Summary of changes

Add a `safekeeper_wal_receiver_queue_depth` histogram for queue depths per WAL receiver. Sampled once...
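To illustrate the idea of sampling queue depths into a histogram, here is a minimal Python sketch. It is not the Safekeeper implementation: the class `DepthHistogram`, the helper `sample_queue_depths`, and the bucket bounds are all hypothetical, and the sampling interval is left out because the description above is truncated.

```python
import bisect
from collections import deque


class DepthHistogram:
    """Fixed-bucket histogram (hypothetical, for illustration only).

    counts[i] holds samples in (bounds[i-1], bounds[i]]; the final slot
    collects everything larger than the last bound.
    """

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0

    def observe(self, value):
        # bisect_left finds the first bound that is >= value.
        idx = bisect.bisect_left(self.bounds, value)
        self.counts[idx] += 1
        self.total += 1


def sample_queue_depths(queues, histogram):
    """Record the current depth of every receiver queue exactly once."""
    for queue in queues:
        histogram.observe(len(queue))


# Three stand-in receiver queues with depths 0, 3 and 20.
receiver_queues = [deque(), deque([1, 2, 3]), deque(range(20))]
histogram = DepthHistogram(bounds=[0, 1, 4, 16])
sample_queue_depths(receiver_queues, histogram)
```

Sampling periodically, rather than on every enqueue, keeps the bookkeeping off the hot path, at the cost of missing short-lived spikes between samples.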
## Problem

The `WalAcceptor` main loop currently uses two nested loops to consume inbound messages. This makes it hard to slot in periodic events like metrics collection (e.g. #9328). It...
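As a rough illustration of why a single flat loop makes periodic events easier to add, the following Python `asyncio` sketch waits for either the next inbound message or the next tick deadline in one place. This is an assumption-laden sketch, not the `WalAcceptor` code: `acceptor_loop`, `handle`, `on_tick`, and the `None` shutdown sentinel are hypothetical names chosen for the example.

```python
import asyncio


async def acceptor_loop(inbox, handle, on_tick, tick_interval):
    """Consume messages from `inbox`, calling `on_tick` about every `tick_interval` seconds.

    A `None` message is treated as a shutdown sentinel (an assumption of this sketch).
    """
    loop = asyncio.get_running_loop()
    next_tick = loop.time() + tick_interval
    while True:
        # Wait for a message, but never past the next tick deadline.
        timeout = max(0.0, next_tick - loop.time())
        try:
            msg = await asyncio.wait_for(inbox.get(), timeout)
        except asyncio.TimeoutError:
            on_tick()  # periodic work, e.g. sampling metrics
            next_tick = loop.time() + tick_interval
            continue
        if msg is None:
            return
        handle(msg)


async def demo():
    inbox = asyncio.Queue()
    seen, ticks = [], []
    for msg in ("a", "b", "c"):
        inbox.put_nowait(msg)

    async def close_later():
        # Leave the queue idle for a while so some ticks fire, then shut down.
        await asyncio.sleep(0.2)
        inbox.put_nowait(None)

    closer = asyncio.create_task(close_later())
    await acceptor_loop(inbox, seen.append, lambda: ticks.append(1), 0.02)
    await closer
    return seen, ticks
```

Because both event sources are handled in the same loop body, adding another periodic task only means adding another deadline, rather than threading it through an inner loop.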
When shipping WAL from safekeepers to pageservers, we presumably buffer the WAL records. We should have a metric tracking how much WAL is buffered or in-flight. Ideally per timeline, but...
## Problem

In #9259, we found that the `check_safekeepers_synced` fast path could result in a lower basebackup LSN than the `flush_lsn` reported by Safekeepers in `VoteResponse`, causing the compute to...
We should measure WAL ingest throughput, both for the pageserver and safekeepers in isolation and for the entire pipeline. This should cover local tests as well as runs with remote storage in the loop. Objective: find ingest...
After #9337, when shards restart and need to catch up on old WAL, each shard will pull WAL records from S3 and filter them. This results in O(catchup_ranges) work. We...