
Epic: set up infra for periodic perf tests with bigger databases

Open · kelvich opened this issue 3 years ago

Right now we have some infrastructure to run periodic tests, but it needs more love. For a while safekeepers could not automatically delete WAL, so we avoided big databases because they caused disk overflow on staging. Now that safekeepers can clean up their WAL, let's start doing more realistic tests.

As a result we want to have the following:

  • Grafana overview dashboard with one panel per test -- each panel shows the test result as a single number (seconds or transactions) plotted against test completion time.
  • Links to the per-timeline dashboard for each test run.

Approximate steps:

  • [x] bring existing periodic perf tests to life: https://github.com/neondatabase/neon/pull/2037
  • [x] update runner (or use k8s runners): https://github.com/neondatabase/neon/pull/2037
  • [ ] put results in S3 instead of git (or are they stored in Postgres? -- find out)
  • [ ] increase scale up to 300 GB
  • [ ] increase run time to a few hours: https://github.com/neondatabase/neon/pull/2077
  • [ ] set up / fix grafana dashboards
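For the "put results in S3" step, the flat `test_name.metric: value unit` lines that the perf tests print (the format is visible in the benchmark output later in this thread) would first need parsing into structured records. A minimal sketch, assuming that format; the exact upload target and key layout are not specified here:

```python
# Parse the flat "test_name.metric: value unit" lines emitted by the perf
# tests into a dict, ready to be serialized and uploaded (e.g. to S3).
# The line format is taken from the benchmark output in this thread;
# everything else is a hypothetical sketch.
import re

LINE_RE = re.compile(r"^(?P<name>\S+?)\.(?P<metric>[\w.]+):\s+(?P<value>[\d,.]+)")

def parse_results(text: str) -> dict:
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            # Strip thousands separators like "3,778.830" before converting.
            key = f"{m['name']}.{m['metric']}"
            results[key] = float(m["value"].replace(",", ""))
    return results

sample = "test_pgbench_remote_init[3600-3424].init.duration: 3,778.830 s"
print(parse_results(sample))
```

A dict like this can be dumped to JSON and stored under a per-run key, which also makes it easy to feed the Grafana panels described above.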

kelvich avatar Jul 01 '22 11:07 kelvich

https://github.com/neondatabase/neon/issues/2221 is required

bayandin avatar Aug 15 '22 14:08 bayandin

During today's meeting @hlinnaka mentioned that we're ready to increase the scale further towards 300 GB, e.g. 50 GB first?

stepashka avatar Oct 10 '22 15:10 stepashka

is this still blocked by neondatabase/neon#2221 ?

stepashka avatar Oct 10 '22 15:10 stepashka

> During today's meeting @hlinnaka mentioned that we're ready to increase the scale further towards 300 GB, e.g. 50 GB first?

I've experimented with pgbench -i; setting up a 50 GB (scale 3424) database took ~1 h (with prefetch enabled):

------------------------------ Benchmark results -------------------------------
test_pgbench_remote_init[3600-3424].scale: 3424 
test_pgbench_remote_init[3600-3424].init.start_timestamp: 1665488570 
test_pgbench_remote_init[3600-3424].init.end_timestamp: 1665492349 
test_pgbench_remote_init[3600-3424].init.duration: 3,778.830 s
test_pgbench_remote_init[3600-3424].init.drop_tables: 0.030 s
test_pgbench_remote_init[3600-3424].init.create_tables: 0.130 s
test_pgbench_remote_init[3600-3424].init.client_side_generate: 1,419.680 s
test_pgbench_remote_init[3600-3424].init.vacuum: 1,607.980 s
test_pgbench_remote_init[3600-3424].init.primary_keys: 750.590 s

https://github.com/neondatabase/neon/actions/runs/3226525705/jobs/5280150287
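The scale factor above follows from pgbench's sizing: one scale unit is 100,000 `pgbench_accounts` rows, roughly 15 MB on disk including indexes (an approximation, not a figure from this thread; it matches 50 GB ≈ scale 3424 to within a few percent). A quick estimator for the other targets discussed here:

```python
# Rough pgbench scale-factor estimate for a target database size.
# Assumption (not from the thread): ~15 MB per scale unit, i.e. 100,000
# pgbench_accounts rows plus indexes; the real figure varies with
# fillfactor and PostgreSQL version.
MB_PER_SCALE_UNIT = 15

def scale_for_size(target_gb: float) -> int:
    """Return the approximate pgbench --scale for a target size in GB."""
    return round(target_gb * 1024 / MB_PER_SCALE_UNIT)

print(scale_for_size(50))   # close to the scale 3424 used in the run above
print(scale_for_size(300))  # rough scale for the 300 GB goal of this epic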

A couple of clarifying questions:

  • Should we replace the 10 GB test with a 50 GB one, or run it in addition? We could run them alternately (10 GB one day, 50 GB the next).
  • Currently we don't have a 50 GB project to reuse. Should I prepare one?

@hlinnaka what do you think?

> is this still blocked by https://github.com/neondatabase/neon/issues/2221 ?

Now that projects are created via the API, it isn't really blocked by that. I believe that as the DB size grows we could be blocked by https://github.com/neondatabase/cloud/issues/1872

bayandin avatar Oct 11 '22 13:10 bayandin

Hey folks, we were trying to deploy Neon for 10 TB+ of data. Would we need 10 TB of storage on the pageserver?

abhishektvz avatar Feb 14 '23 17:02 abhishektvz

this is in Alexander's backlog, still in progress

stepashka avatar Aug 02 '23 15:08 stepashka