Arthur Petukhovsky

Results 29 issues of Arthur Petukhovsky

We should stop calling our current WAL upload to remote storage a "backup" in code and comments. The segment in remote storage can be the only copy of the...

c/storage/safekeeper
a/tech_debt

In these tests, `wait_lsn_timeout` was set to `1s`, which was too low. If something blinks (skps connection, sk activation, broker) during the test, the test will likely fail because of...

We have an issue where some partially uploaded segments can actually be missing from remote storage. I found this issue while looking at the logs in staging, and it...

The error means that the manager exited earlier than `ResidenceGuard`, which is not unexpected with the current deletion implementation. This commit changes the log level to reduce noise.

Split from https://github.com/neondatabase/autoscaling/pull/1013#discussion_r1684702324

Arthur's prototype from January: https://github.com/neondatabase/neon/tree/sk-sharding-stream ## Precursors We may start by refactoring pageserver WAL ingest code to make decoding WAL records more independent of Timeline. ```[tasklist] ### Tasks - [...

t/bug
c/storage/safekeeper

## Steps to reproduce Start a walproposer with an invalid config (duplicate safekeeper) in `postgresql.conf`, like: ``` neon.safekeepers = 'safekeeper-1.us-east-2.aws.neon.build:6401,safekeeper-3.us-east-2.aws.neon.build:6401,safekeeper-1.us-east-2.aws.neon.build:6401' ``` This is not a new problem, but I haven't...

t/bug
c/storage/safekeeper
triaged

Found errors like this in the logs: ``` starting upload PartialRemoteSegment { status: InProgress, name: "000000010000000000000002_103_0000000002000000_0000000002000000_sk347.partial", commit_lsn: 0/2000000, flush_lsn: 0/2000000, term: 103 } failed to upload 000000010000000000000002_103_0000000002000000_0000000002000000_sk347.partial: Failed to open...

t/bug
c/storage/safekeeper
triaged

## Motivation Currently we use the standard Postgres way to download WAL: `START_REPLICATION`, described in the Postgres docs at https://www.postgresql.org/docs/current/protocol-replication.html The main problem with it is scalability, because each connection...

c/storage/safekeeper
a/scalability
a/performance