pageserver: slow basebackup when there are too many aux files

Open skyzh opened this issue 1 year ago • 1 comments

https://neondb.slack.com/archives/C03438W3FLZ/p1728330685012419

The devprod team has a testing project that cannot start because basebackup is too slow. The pageserver takes ~5 min to scan all aux files, and in the meantime no data is sent over the wire protocol, so the compute times out.

skyzh avatar Oct 17 '24 15:10 skyzh

For the project in staging, we decided to extend the start timeout so that it can at least start.

skyzh avatar Oct 20 '24 19:10 skyzh

this week: continue investigation

skyzh avatar Oct 28 '24 14:10 skyzh

Running the previous aux file pagebench (cargo run --bin pagebench aux-files) with 10k files does not show any perf regressions.

So I guess the slow basebackup for this tenant is related to overwriting existing aux key-value pairs, or to inefficiency in the layer structure, rather than to perf issues in the vectored read path. Need to look into the layer map and reproduce it.

skyzh avatar Oct 28 '24 20:10 skyzh

For this tenant, no aux image file has been generated since LSN 00000210492E07B8 (generation 000001fa).

skyzh avatar Oct 29 '24 16:10 skyzh

Forcing a manual compaction resolves this. I can run it in staging while I keep investigating locally. It seems that aux compaction is not triggered correctly in this specific case.

skyzh avatar Oct 29 '24 17:10 skyzh

Basebackup is unstuck for that project (temporarily). Compute cannot start due to wal_level (the compute team is taking over to investigate). Still need to investigate why image layers are not generated.

skyzh avatar Oct 29 '24 19:10 skyzh

this week: investigate if we can improve read path by not tracking keys on sparse keyspace

skyzh avatar Nov 04 '24 14:11 skyzh

another interesting observation is that the aux file layers are usually super small after L0->L1 compaction:

-rw-r--r--   1 skyzh  staff   224K Nov  4 15:46 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__00000216783D4AC9-0000021678403A51-v1-000001fc
-rw-r--r--   1 skyzh  staff   224K Nov  4 15:46 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__0000021678403A51-00000216784329F1-v1-000001fc
-rw-r--r--   1 skyzh  staff   224K Nov  4 15:46 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__00000216784329F1-0000021678461551-v1-000001fc
-rw-r--r--   1 skyzh  staff   224K Nov  4 15:46 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__0000021678461551-00000216784904F1-v1-000001fc
-rw-r--r--   1 skyzh  staff   208K Nov  4 15:46 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__00000216784B9FC9-00000216784E52A1-v1-00000200
-rw-r--r--   1 skyzh  staff   224K Nov  4 15:27 030000000000000000000000000000000001-62000002011CDE14934B1DC19112DDCD798B__000002207A723049-000002207A751441-v1-00000209
-rw-r--r--   1 skyzh  staff    24K Nov  4 15:27 030000000000000000000000000000000002-630000000000000000000000000000010000__00000210492E07B8-v1-000001fa
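
To make the listing easier to reason about, here is a minimal Python sketch that splits those file names into their parts and counts how many delta layers sit above the newest image layer. The name pattern (key range, then one LSN for an image layer or an LSN range for a delta layer, then -v1- and a generation) is an assumption inferred from the names above, and `parse_layer_name` / `deltas_above_latest_image` are hypothetical helpers, not pageserver APIs.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed pattern, inferred from the listing above (not taken from pageserver code):
#   <key_start>-<key_end>__<lsn_start>[-<lsn_end>]-v1-<generation>
LAYER_RE = re.compile(
    r"^(?P<key_start>[0-9A-Fa-f]+)-(?P<key_end>[0-9A-Fa-f]+)"
    r"__(?P<lsn_start>[0-9A-Fa-f]+)(?:-(?P<lsn_end>[0-9A-Fa-f]+))?"
    r"-v1-(?P<generation>[0-9A-Fa-f]+)$"
)


class Layer(NamedTuple):
    key_start: str
    key_end: str
    lsn_start: int
    lsn_end: Optional[int]  # None when the name carries a single LSN (image layer)
    generation: int

    @property
    def is_image(self) -> bool:
        return self.lsn_end is None


def parse_layer_name(name: str) -> Layer:
    """Split a layer file name into key range, LSN(s) and generation."""
    m = LAYER_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized layer file name: {name}")
    lsn_end = m.group("lsn_end")
    return Layer(
        key_start=m.group("key_start"),
        key_end=m.group("key_end"),
        lsn_start=int(m.group("lsn_start"), 16),
        lsn_end=int(lsn_end, 16) if lsn_end is not None else None,
        generation=int(m.group("generation"), 16),
    )


def deltas_above_latest_image(names: List[str]) -> List[Layer]:
    """Return the delta layers that start at or above the newest image layer's LSN."""
    layers = [parse_layer_name(n) for n in names]
    image_lsns = [layer.lsn_start for layer in layers if layer.is_image]
    floor = max(image_lsns) if image_lsns else 0
    return [l for l in layers if not l.is_image and l.lsn_start >= floor]
```

Feeding it the seven names from the listing would put every delta layer above the single image layer at 00000210492E07B8, which matches the observation that no image layer was generated after that LSN. Note that this sketch ignores key ranges entirely, so it only compares LSNs.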

skyzh avatar Nov 05 '24 14:11 skyzh

We have two fixes: make the read path faster, and make compaction more aggressive. The latter shouldn't affect the amplification because the aux files are really small once the updates are stored as deltas.
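
As a rough illustration of the "more aggressive compaction" idea, the sketch below is a toy policy in Python: force an aux image layer once enough delta layers have piled up above the last image, and estimate how many bytes that rewrite would have to read. The threshold, `AuxCompactionPolicy`, and `rewrite_cost_bytes` are all hypothetical names and values chosen for illustration. This is not how the pageserver decides when to create image layers.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AuxCompactionPolicy:
    # Hypothetical knob: how many delta layers may stack up above the
    # newest aux image layer before a new image layer is forced.
    max_deltas_above_image: int = 3

    def should_create_image(self, deltas_above_image: int) -> bool:
        """True once the delta stack has reached the configured limit."""
        return deltas_above_image >= self.max_deltas_above_image


def rewrite_cost_bytes(delta_sizes: Iterable[int]) -> int:
    """Total bytes of delta layers that would be read to build one image layer."""
    return sum(delta_sizes)


# Sizes taken from the listing in the previous comment (ls -h rounds them):
# five ~224K delta layers and one ~208K delta layer.
KIB = 1024
observed_deltas = [224 * KIB] * 5 + [208 * KIB]

policy = AuxCompactionPolicy()
if policy.should_create_image(len(observed_deltas)):
    cost = rewrite_cost_bytes(observed_deltas)
    print(f"would create aux image layer, reading about {cost // KIB} KiB of deltas")
```

With layers this small, folding them into an image reads on the order of a megabyte, which is the intuition behind expecting little extra amplification from a lower threshold.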

skyzh avatar Nov 05 '24 21:11 skyzh