Epic: set up infra for periodic perf tests with bigger databases
Right now we have some infrastructure to run periodic tests, but it needs more work. For a while, safekeepers could not automatically delete WAL, so we avoided big databases because they caused disk overflow on staging. Now that safekeepers can clean up their WAL, let's start doing more realistic tests.
As a result, we want to have the following:
- Grafana overview dashboard with one panel per test -- each panel shows the test result as a single number (seconds or transactions) against test completion time.
- Links to per-timeline dashboard for each test run.
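For reference, a minimal sketch of what the ingestion side could look like, assuming each run writes one row into a Postgres table that Grafana then queries (the `perf_test_results` table name and schema here are assumptions, not the actual pipeline):

```python
# Sketch: record one number per test run so a Grafana panel can plot
# "result vs. completion time". Table name and schema are assumptions.
import time

import psycopg2

def record_result(connstr: str, test_name: str, value: float, unit: str) -> None:
    """Insert a single (test, value) data point keyed by completion time."""
    with psycopg2.connect(connstr) as conn:
        with conn.cursor() as cur:
            cur.execute(
                """
                CREATE TABLE IF NOT EXISTS perf_test_results (
                    test_name   text,
                    value       double precision,
                    unit        text,
                    finished_at timestamptz
                )
                """
            )
            cur.execute(
                "INSERT INTO perf_test_results VALUES (%s, %s, %s, to_timestamp(%s))",
                (test_name, value, unit, time.time()),
            )

# Example: record_result(connstr, "test_pgbench_remote_init[3600-3424]", 3778.83, "s")
```

Each overview panel would then be a simple time-series query along the lines of `SELECT finished_at AS time, value FROM perf_test_results WHERE test_name = '...'`.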
Approximate steps:
- [x] bring existing periodic perf tests to life: https://github.com/neondatabase/neon/pull/2037
- [x] update runner (or use k8s runners): https://github.com/neondatabase/neon/pull/2037
- [ ] put results in S3 instead of git (or are they already stored in Postgres? -- find out); see the upload sketch after this list
- [ ] increase scale up to 300GB
- [ ] increase run time to a few hours: https://github.com/neondatabase/neon/pull/2077
- [ ] set up / fix grafana dashboards
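For the S3 step, a minimal sketch of the upload, assuming a `neon-perf-results` bucket and a flat timestamped key layout (both are illustrative assumptions):

```python
# Sketch: push one run's results JSON to S3 instead of committing it to git.
# Bucket name and key layout are assumptions for illustration.
import json
import time

import boto3

def upload_results(results: dict, bucket: str = "neon-perf-results") -> str:
    """Upload one run's results as a timestamped JSON object and return its URI."""
    key = f"periodic/{int(time.time())}.json"
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(results).encode(),
        ContentType="application/json",
    )
    return f"s3://{bucket}/{key}"
```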
https://github.com/neondatabase/neon/issues/2221 is required
During the meeting today @hlinnaka mentioned that we're ready to increase the scale further towards 300GB, e.g. 50GB first?
Is this still blocked by neondatabase/neon#2221?
> During the meeting today @hlinnaka mentioned that we're ready to increase the scale further towards 300GB, e.g. 50GB first?
I've experimented with pgbench -i; setting up a 50 GB (scale 3424) database took ~1 h (with prefetch enabled):
```
------------------------------ Benchmark results -------------------------------
test_pgbench_remote_init[3600-3424].scale: 3424
test_pgbench_remote_init[3600-3424].init.start_timestamp: 1665488570
test_pgbench_remote_init[3600-3424].init.end_timestamp: 1665492349
test_pgbench_remote_init[3600-3424].init.duration: 3,778.830 s
test_pgbench_remote_init[3600-3424].init.drop_tables: 0.030 s
test_pgbench_remote_init[3600-3424].init.create_tables: 0.130 s
test_pgbench_remote_init[3600-3424].init.client_side_generate: 1,419.680 s
test_pgbench_remote_init[3600-3424].init.vacuum: 1,607.980 s
test_pgbench_remote_init[3600-3424].init.primary_keys: 750.590 s
```
https://github.com/neondatabase/neon/actions/runs/3226525705/jobs/5280150287
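For reproducibility, this is roughly the initialization the numbers above correspond to; the per-phase lines map onto pgbench's explicit init steps. A sketch, assuming a `BENCHMARK_CONNSTR` environment variable (the variable name is an assumption):

```python
# Sketch: reproduce the 50 GB pgbench initialization. The report's phases map
# onto pgbench init steps: d(rop tables), t(create tables), g(enerate data
# client-side), v(acuum), p(rimary keys).
import os
import subprocess

scale = 3424  # ~50 GB of pgbench data

subprocess.run(
    [
        "pgbench",
        "-i",              # initialize mode
        "-s", str(scale),  # scale factor
        "-I", "dtgvp",     # explicit init steps, matching the phases above
        os.environ["BENCHMARK_CONNSTR"],  # libpq connection string (assumed name)
    ],
    check=True,
)
```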
A couple of clarifying questions:
- Should we replace the 10GB test with the 50GB one, or run both? We could run them alternately (10GB one day, 50GB the next).
- Currently we don't have a 50 GB project to reuse. Should I prepare one?
@hlinnaka what do you think?
> Is this still blocked by https://github.com/neondatabase/neon/issues/2221?
Now that we create projects via the API, it isn't really blocked by that anymore. I believe that as we increase the DB size we could be blocked by https://github.com/neondatabase/cloud/issues/1872
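For context, creating a throwaway project for a run could look roughly like this; the endpoint shape, request body, and `NEON_API_KEY` variable follow the public v2 console API as I understand it and should be treated as assumptions:

```python
# Sketch: create a project for a perf run via the console API.
# Endpoint and payload shape are assumptions; adjust to the API actually in use.
import os

import requests

resp = requests.post(
    "https://console.neon.tech/api/v2/projects",
    headers={"Authorization": f"Bearer {os.environ['NEON_API_KEY']}"},
    json={"project": {"name": "perf-50gb"}},  # hypothetical project name
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["project"]["id"])
```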
Hey folks, we were trying to deploy Neon for 10 TB+ of data. Would we need 10 TB of storage on the pageserver?
This is in Alexander's backlog, still in progress.