Add Next.js implementation of the TechEmpower Fortunes benchmark

Open DamianEdwards opened this issue 2 years ago • 2 comments

This is an attempt to add an implementation of the TechEmpower Fortunes benchmark in Next.js. This is interesting to us given Blazor's updates in .NET 8 to support SSR, etc.

Some notes about this implementation:

  • I started by following the Next.js docs as linked to from the "Getting Started" link on their home page
  • The app is using the new "App Router" in Next.js 13, rather than the (legacy?) "Pages Router"
  • The app is downgraded to use Next.js 13.4.0 rather than latest due to this issue
  • It's using the node-postgres PostgreSQL node client with a shared pool
  • The HTML table on the /fortunes page doesn't exactly match the example response in the requirements: it includes a <tbody> element, because omitting it causes issues with Next.js partial rendering, i.e. an error is raised in the browser that the app could not be "hydrated" due to an expected <tr> not being present.
  • The response HTML for /fortunes includes a number of extra elements not specified in the templates, including a <meta> viewport element in the <head>, and all the <script> elements at the end of the page required to enable Next.js client-side rendering. I could not find a way to disable either of these.
  • The app is using a custom server in order to enable node clustering (scaling across multiple CPUs) in production. This was adapted from the official custom server example here
  • When running the app locally on my development desktop (Intel Core i9-12900K), after building with npm run build and running with npm run start, with Postgres in a Docker container, and load generated by Bombardier, I see ~1,120 requests per second:
     $ C:\tools\bombardier-windows-amd64.exe http://localhost:3000/fortunes -d 5s -c 100
     Bombarding http://localhost:3000/fortunes for 5s using 100 connection(s)
     [================================================] 5s
     Done!
     Statistics        Avg      Stdev        Max
     Reqs/sec      1114.53     371.69    2442.34
     Latency       88.92ms    13.40ms   158.84ms
     HTTP codes:
       1xx - 0, 2xx - 5683, 3xx - 0, 4xx - 0, 5xx - 0
       others - 0
     Throughput:     9.52MB/s
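
As a rough cross-check of the Bombardier output above: for a closed-loop load generator, throughput is approximately the number of connections divided by the mean latency (Little's law). The following is a minimal sketch using only the figures printed in that run; the function names are illustrative and not part of the benchmark app.

```javascript
// Little's law for a closed-loop load test: throughput ≈ concurrency / mean latency.
function expectedRps(connections, meanLatencyMs) {
  return connections / (meanLatencyMs / 1000);
}

// Average rate implied by the number of responses over the test duration.
function observedRps(totalResponses, durationSeconds) {
  return totalResponses / durationSeconds;
}

// Figures from the run above: -c 100, -d 5s, 88.92ms average latency, 5683 2xx responses.
const fromLatency = expectedRps(100, 88.92);
const fromCounts = observedRps(5683, 5);

console.log(fromLatency.toFixed(0)); // 1125
console.log(fromCounts.toFixed(0)); // 1137
```

Both estimates land close to the ~1,120 requests per second reported, so the figures are internally consistent: at 100 connections and roughly 89 ms per request, this is about the throughput one would expect.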
    

Data from first run on the perf infrastructure:

db
  Max CPU Usage (%)        4        ▂ █▂▂▂▂▂▂▂
  Max Cores usage (%)      105      ▂ █▂▂▂▂▂▂▂
  Max Working Set (MB)     221      ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁████▆▆▆▆▆▆▆▁
  Build Time (ms)          4,926
  Start Time (ms)          1,095
  Published Size (KB)      370,251

application
  Max CPU Usage (%)        100      ████████
  Max Cores usage (%)      2,793    ████████
  Max Working Set (MB)     6,337    ▂▄▅▆▆▇▇▇████
  Build Time (ms)          29,232
  Start Time (ms)          1,123
  Published Size (KB)      195,736

load
  Max CPU Usage (%)        17       ▂▇▇▇██████▆
  Max Cores usage (%)      471      ▂▇▇▇██████▆
  Max Working Set (MB)     48       ██████████████▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
  Max Private Memory (MB)  358      ▃▃▃▃▃▃▃▃▃▃▃▃▃▃███████████████
  Start Time (ms)          0
  First Request (ms)       551
  Requests/sec             913
  Requests                 13,785
  Mean latency (ms)        174.21
  Max latency (ms)         603.67
  Bad responses            0
  Socket errors            256
  Read throughput (MB/s)   7.71
  Latency 50th (ms)        161.25
  Latency 75th (ms)        223.05
  Latency 90th (ms)        294.93
  Latency 99th (ms)        433.45

DamianEdwards, Jun 15 '23 17:06

Updated as part of investigating the performance:

  • Added ability to run with no database query via environment variable NO_DB (set to a JS falsey value)
  • Added ability to set number of workers via environment variable WORKER_COUNT
  • Added ability to set Postgres pool max client count via environment variable DB_MAX_CLIENTS & changed its default to os.cpus().length
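
A minimal sketch of how the WORKER_COUNT and DB_MAX_CLIENTS settings described above might be read and used with Node's built-in cluster module. This is not the code from the PR: the helper names are hypothetical, `startWorker` stands in for whatever starts the Next.js custom server, and defaulting WORKER_COUNT to the CPU count is an assumption (the comment above only states that default for DB_MAX_CLIENTS).

```javascript
const cluster = require('node:cluster');
const os = require('node:os');

// Parse a positive integer, falling back when the value is unset or invalid.
function positiveIntOr(raw, fallback) {
  const parsed = Number.parseInt(raw ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Read the two numeric settings from the environment.
// Assumption: WORKER_COUNT defaults to the CPU count, like DB_MAX_CLIENTS.
function readConfig(env = process.env, cpuCount = os.cpus().length) {
  return {
    workerCount: positiveIntOr(env.WORKER_COUNT, cpuCount),
    dbMaxClients: positiveIntOr(env.DB_MAX_CLIENTS, cpuCount),
  };
}

// Fork `workerCount` workers from the primary process; each worker runs `startWorker`.
// `startWorker` is a hypothetical callback that would start the HTTP server.
function runClustered(config, startWorker) {
  if (cluster.isPrimary) {
    for (let i = 0; i < config.workerCount; i++) cluster.fork();
  } else {
    startWorker(config);
  }
}
```

In a setup like this, `dbMaxClients` would typically be passed as the `max` option of the node-postgres `Pool`. If each worker process creates its own pool, the total number of Postgres connections can reach `workerCount × dbMaxClients`, which is worth keeping in mind when tuning both values together.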

Ran the numbers again with no database and a reduced worker count, and performance is still under 1,000 RPS.

DamianEdwards, Jun 18 '23 23:06

New update: I noticed that Next.js is always sending Connection: close response headers. This will dramatically impact request latency, since every request then needs a new connection. I'm trying to figure out what's going on, but none of the obvious things seem to affect it yet, e.g. setting the environment variable KEEP_ALIVE_TIMEOUT, which is read by the server.js file generated by the standalone build. This only seems to be happening in non-dev, so it appears to be something related to the optimized "production" build produced by next build.
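
One way to check for this from Node, sketched under a few assumptions: the URL is the local /fortunes endpoint used earlier in this thread, the function names are hypothetical, and the request goes through a keep-alive agent so that the client itself does not ask the server to close the connection. The helper applies the standard HTTP rule: HTTP/1.1 connections are persistent unless `Connection: close` is sent, and HTTP/1.0 connections close unless `Connection: keep-alive` is sent.

```javascript
const http = require('node:http');

// Decide whether a response leaves the connection open for reuse.
function isPersistent(httpVersion, headers) {
  const tokens = String(headers.connection ?? '')
    .toLowerCase()
    .split(',')
    .map((t) => t.trim());
  if (tokens.includes('close')) return false;
  if (httpVersion === '1.0') return tokens.includes('keep-alive');
  return true;
}

// Issue one GET and report what the server said about connection reuse.
// The default URL is an assumption based on the local run described earlier.
function checkConnectionHeader(url = 'http://localhost:3000/fortunes') {
  // Use a keep-alive agent so the client does not request "Connection: close" itself.
  const agent = new http.Agent({ keepAlive: true });
  http
    .get(url, { agent }, (res) => {
      res.resume(); // discard the body
      console.log('Connection:', res.headers.connection ?? '(not sent)');
      console.log('persistent:', isPersistent(res.httpVersion, res.headers));
      res.on('end', () => agent.destroy()); // release the pooled socket
    })
    .on('error', (err) => {
      console.error('request failed:', err.message);
      agent.destroy();
    });
}
```

Calling `checkConnectionHeader()` against the dev server and then against the production build should make the difference described above visible, if it reproduces in a given environment.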

I've started a discussion on the next.js repo.

UPDATE: I reproduced this behavior on multiple machines (including Mac), in a brand new Next.js app, on multiple versions of Node, and in multiple browsers, so I created an issue on next.js.

DamianEdwards, Jun 19 '23 17:06