How to Optimize the Performance of My fasthttp Client in a Production Environment
In my production environment, I use fasthttp to call third-party services. During peak traffic, the fasthttp client sees elevated latency, with some requests delayed by several seconds. To investigate, I ran a load test and found that latency grows as the number of connections increases.
Fasthttp version: v1.55.0
Load test environment:
Model Name: MacBook Pro
Model Identifier: MacBookPro18,3
Model Number: MKGP3CH/A
Chip: Apple M1 Pro
Total Number of Cores: 8 (6 performance and 2 efficiency)
Memory: 16 GB
Code simulating the third-party service:
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"

	"github.com/valyala/fasthttp"
)

var (
	strContentType = []byte("Content-Type")
	strApplication = []byte("application/json")
	body           = []byte(`{"message": "Hello, world!"}`)
)

func main() {
	// pprof endpoint for profiling the mock service.
	go func() {
		if err := http.ListenAndServe("localhost:7001", nil); err != nil {
			log.Fatalf("Error in ListenAndServe: %v", err)
		}
	}()

	if err := fasthttp.ListenAndServe("localhost:8001", handler); err != nil {
		log.Fatalf("Error in ListenAndServe: %v", err)
	}
}

func handler(ctx *fasthttp.RequestCtx) {
	begin := time.Now()

	// handle request
	{
		ctx.Response.Header.SetCanonical(strContentType, strApplication)
		ctx.Response.SetStatusCode(fasthttp.StatusOK)
		ctx.Response.SetBody(body)
	}

	log.Printf("%v | %s %s %v %v",
		ctx.RemoteAddr(),
		ctx.Method(),
		ctx.RequestURI(),
		ctx.Response.Header.StatusCode(),
		time.Since(begin),
	)
}
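One caveat when reading the profiles and numbers later in this post: both this mock service and the proxy in the next snippet call log.Printf on every request. The log package serializes output behind a mutex and performs a write syscall per call, so at roughly 30k requests per second the logging itself can account for a noticeable share of a syscall-heavy profile. A sketch of gating it behind a flag (the verbose flag is illustrative, not part of the original snippet):

package main

import (
	"flag"
	"log"
	"time"

	"github.com/valyala/fasthttp"
)

var (
	strContentType = []byte("Content-Type")
	strApplication = []byte("application/json")
	body           = []byte(`{"message": "Hello, world!"}`)

	// verbose gates per-request logging so the measurement does not
	// perturb what is being measured during a load test.
	verbose = flag.Bool("verbose", false, "log every request")
)

func main() {
	flag.Parse()
	if err := fasthttp.ListenAndServe("localhost:8001", quietHandler); err != nil {
		log.Fatalf("Error in ListenAndServe: %v", err)
	}
}

func quietHandler(ctx *fasthttp.RequestCtx) {
	begin := time.Now()

	ctx.Response.Header.SetCanonical(strContentType, strApplication)
	ctx.Response.SetStatusCode(fasthttp.StatusOK)
	ctx.Response.SetBody(body)

	if *verbose {
		log.Printf("%v | %s %s %v %v",
			ctx.RemoteAddr(), ctx.Method(), ctx.RequestURI(),
			ctx.Response.Header.StatusCode(), time.Since(begin))
	}
}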
Code snippet that calls the simulated third-party service:
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"

	"github.com/valyala/fasthttp"
)

var (
	client *fasthttp.HostClient
)

const (
	readTimeout         = 3 * time.Second
	writeTimeout        = 3 * time.Second
	maxConnsPerHost     = 2048
	maxIdleConnDuration = 3 * time.Minute
)

func main() {
	client = &fasthttp.HostClient{
		Addr:                          "localhost:8001",
		MaxConns:                      maxConnsPerHost,
		ReadTimeout:                   readTimeout,
		WriteTimeout:                  writeTimeout,
		MaxIdleConnDuration:           maxIdleConnDuration,
		NoDefaultUserAgentHeader:      true,
		DisableHeaderNamesNormalizing: true,
		DisablePathNormalizing:        true,
		// sic: the field name is misspelled in fasthttp itself.
		MaxIdemponentCallAttempts: 1,
	}

	// pprof endpoint for profiling the proxy.
	go func() {
		if err := http.ListenAndServe("localhost:7002", nil); err != nil {
			log.Fatalf("Error in ListenAndServe: %v", err)
		}
	}()

	if err := fasthttp.ListenAndServe("localhost:8002", handler); err != nil {
		log.Fatalf("Error in ListenAndServe: %v", err)
	}
}

// api forwards one GET request to the mock third-party service and
// logs the end-to-end duration plus the current connection count.
func api(ctx *fasthttp.RequestCtx) error {
	begin := time.Now()
	defer func() {
		log.Printf("%v | %s %s %v %d",
			ctx.RemoteAddr(),
			ctx.Method(),
			ctx.RequestURI(),
			time.Since(begin),
			client.ConnsCount(),
		)
	}()

	req := fasthttp.AcquireRequest()
	defer fasthttp.ReleaseRequest(req)
	req.SetRequestURI("http://localhost:8001")
	req.Header.SetMethod(fasthttp.MethodGet)

	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseResponse(resp)

	return client.Do(req, resp)
}

func handler(ctx *fasthttp.RequestCtx) {
	if err := api(ctx); err != nil {
		ctx.SetStatusCode(fasthttp.StatusInternalServerError)
	} else {
		ctx.SetStatusCode(fasthttp.StatusOK)
	}
}
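One knob not exercised above: when all MaxConns connections are busy, HostClient.Do returns fasthttp.ErrNoFreeConns immediately unless MaxConnWaitTimeout is set, in which case it waits up to that duration for a connection to free up. A minimal sketch of surfacing pool exhaustion (the timeout value is illustrative):

package main

import (
	"errors"
	"log"
	"time"

	"github.com/valyala/fasthttp"
)

// hc mirrors the configuration above, with one addition:
// MaxConnWaitTimeout makes Do wait briefly for a free connection
// instead of failing at once with ErrNoFreeConns.
var hc = &fasthttp.HostClient{
	Addr:               "localhost:8001",
	MaxConns:           2048,
	MaxConnWaitTimeout: 100 * time.Millisecond, // illustrative value
}

func call() error {
	req := fasthttp.AcquireRequest()
	defer fasthttp.ReleaseRequest(req)
	req.SetRequestURI("http://localhost:8001")
	req.Header.SetMethod(fasthttp.MethodGet)

	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseResponse(resp)

	err := hc.Do(req, resp)
	if errors.Is(err, fasthttp.ErrNoFreeConns) {
		// Every connection stayed busy for the whole wait window:
		// the pool, not the upstream, is the bottleneck.
		log.Printf("connection pool exhausted: %v", err)
	}
	return err
}

func main() {
	if err := call(); err != nil {
		log.Printf("call failed: %v", err)
	}
}

If latency spikes coincide with this error, or with ConnsCount pinned at MaxConns, the delay is queueing for connections rather than slow upstream responses.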
Results obtained with the load testing tool (wrk):
1 connection:
➜ ~ wrk -t1 -c1 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   160.03us  802.36us  14.51ms   97.87%
    Req/Sec    16.41k     2.30k   18.29k    90.10%
  Latency Distribution
     50%   52.00us
     75%   65.00us
     90%   90.00us
     99%    4.04ms
  164890 requests in 10.10s, 14.62MB read
Requests/sec:  16326.54
Transfer/sec:      1.45MB
10 connections:
➜ ~ wrk -t1 -c10 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   622.15us    2.21ms  43.48ms   97.30%
    Req/Sec    30.30k     4.38k   39.26k    74.00%
  Latency Distribution
     50%  279.00us
     75%  427.00us
     90%  611.00us
     99%   10.97ms
  301272 requests in 10.00s, 26.72MB read
Requests/sec:  30121.96
Transfer/sec:      2.67MB
50 connections:
➜ ~ wrk -t1 -c50 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.69ms    1.95ms  36.35ms   96.91%
    Req/Sec    32.71k     4.71k   42.19k    73.00%
  Latency Distribution
     50%    1.46ms
     75%    1.83ms
     90%    2.28ms
     99%   11.05ms
  325559 requests in 10.01s, 28.87MB read
Requests/sec:  32526.90
Transfer/sec:      2.88MB
100 connections:
➜ ~ wrk -t1 -c100 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.70ms    4.27ms  78.96ms   96.63%
    Req/Sec    30.67k     5.77k   43.51k    76.00%
  Latency Distribution
     50%    3.08ms
     75%    3.88ms
     90%    4.82ms
     99%   26.20ms
  305183 requests in 10.01s, 27.07MB read
Requests/sec:  30499.69
Transfer/sec:      2.71MB
500 connections:
➜ ~ wrk -t1 -c500 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.75ms    5.69ms  78.57ms   85.19%
    Req/Sec    34.38k     5.13k   46.05k    72.00%
  Latency Distribution
     50%   14.21ms
     75%   17.06ms
     90%   19.83ms
     99%   39.96ms
  342024 requests in 10.02s, 30.33MB read
  Socket errors: connect 0, read 637, write 0, timeout 0
Requests/sec:  34131.79
Transfer/sec:      3.03MB
1000 connections:
➜ ~ wrk -t1 -c1000 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    30.61ms   10.12ms 110.69ms   77.67%
    Req/Sec    32.04k     7.53k   47.23k    76.00%
  Latency Distribution
     50%   29.75ms
     75%   35.21ms
     90%   41.99ms
     99%   68.50ms
  318908 requests in 10.03s, 28.28MB read
  Socket errors: connect 0, read 3541, write 0, timeout 0
Requests/sec:  31807.34
Transfer/sec:      2.82MB
1500 connections:
➜ ~ wrk -t1 -c1500 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 1500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    44.64ms   16.50ms 212.65ms   87.08%
    Req/Sec    33.34k     7.91k   48.98k    78.00%
  Latency Distribution
     50%   42.72ms
     75%   49.30ms
     90%   58.31ms
     99%  110.18ms
  332420 requests in 10.09s, 29.48MB read
  Socket errors: connect 0, read 3383, write 469, timeout 0
Requests/sec:  32950.19
Transfer/sec:      2.92MB
2000 connections:
➜ ~ wrk -t1 -c2000 -d10s http://localhost:8002 --latency
Running 10s test @ http://localhost:8002
  1 threads and 2000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    59.99ms   29.49ms 411.31ms   92.16%
    Req/Sec    29.86k    13.71k   46.66k    76.04%
  Latency Distribution
     50%   55.47ms
     75%   64.43ms
     90%   74.44ms
     99%  201.06ms
  285246 requests in 10.09s, 25.30MB read
  Socket errors: connect 0, read 16081, write 642, timeout 0
Requests/sec:  28261.07
Transfer/sec:      2.51MB
Latency climbs as the number of connections grows, even though the third-party service itself still responds quickly; in this test its response time is measured in microseconds (µs).
I used flame graphs to help with the analysis, and it appears that most of the time is spent in system calls. What can I do to reduce response latency in this situation?
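If the profile is dominated by network write syscalls, one option worth benchmarking is fasthttp's PipelineClient, which pipelines requests over a limited set of connections and batches pending writes, reducing per-request syscalls and context switches. The trade-off is head-of-line blocking: responses queued behind a slow one on the same connection are delayed with it. A minimal sketch (all settings are illustrative):

package main

import (
	"log"
	"time"

	"github.com/valyala/fasthttp"
)

// pipeline stands in for the HostClient from the snippet above:
// fewer, busier connections with writes batched across requests.
var pipeline = &fasthttp.PipelineClient{
	Addr:               "localhost:8001",
	MaxConns:           64,                     // illustrative
	MaxPendingRequests: 1024,                   // illustrative
	MaxBatchDelay:      100 * time.Microsecond, // wait briefly to batch writes
}

func callPipelined() error {
	req := fasthttp.AcquireRequest()
	defer fasthttp.ReleaseRequest(req)
	req.SetRequestURI("http://localhost:8001")
	req.Header.SetMethod(fasthttp.MethodGet)

	resp := fasthttp.AcquireResponse()
	defer fasthttp.ReleaseResponse(resp)

	return pipeline.DoTimeout(req, resp, 3*time.Second)
}

func main() {
	if err := callPipelined(); err != nil {
		log.Fatalf("pipelined call failed: %v", err)
	}
}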
At 2000 connections I still see a 99th-percentile latency of 201.06ms. Is that not good? It makes sense that latency increases as the number of connections grows, since both wrk and fasthttp start to take up more CPU. Did you expect anything else here?
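A quick way to test that hypothesis on a single machine is to cap the proxy's scheduler threads so wrk keeps dedicated cores, then re-run the test and compare tail latency. A sketch, added as an extra file in the proxy's package; the 6-of-8 core split is illustrative for the M1 Pro above:

package main

import "runtime"

// Leave two of the eight cores free for wrk so the load generator
// and the proxy are not competing for the same CPUs.
func init() {
	runtime.GOMAXPROCS(6)
}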