go-cache-demo
caching concurrently on high volume requests
Hello, I've implemented your code with some minor adjustments in my REST endpoint, which takes a URL with parameters and returns a JSON document. In that scenario, your code is perfect for middleware []byte storage. However, I've run into a problem when a stress-testing tool such as bombardier hits the endpoint where I've enabled caching support, using the following parameters: "/bombardier --method=GET --latencies --duration=10s --connections=150 127.0.0.1:3030/evs/1/at/153000,674000,1404000,1433926/". Instead of caching only the first request, the endpoint caches a few more past the first one, as depicted in the following log output:
INFO 2019/08/09 12:51:54 Starting server on :3030
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
New page cached: /evs/1/at/153000,674000,1404000,1433926 for 100s
Any thoughts?
That happens because there are a lot of concurrent requests that MISS the first cache lookup, so they all execute your handler and then cache the response. Subsequent requests will then HIT the cache lookup.
To avoid that, you'll need to use a Mutex to limit execution to a single goroutine at a time: https://gobyexample.com/mutexes
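For illustration, the idea is roughly the following sketch (untested; it assumes a cache with the Get/Set shape that appears later in this thread, where Get returns a nil []byte on a miss):

```go
package middleware

import (
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// Cache is an assumed interface matching the Get/Set storage used in this thread.
type Cache interface {
	Get(key string) []byte // nil on a miss
	Set(key string, content []byte, d time.Duration)
}

// cachedSketch serializes the whole lookup-miss-fill sequence with one mutex,
// so only one goroutine at a time can decide to regenerate a page.
func cachedSketch(cache Cache, d time.Duration, next http.HandlerFunc) http.Handler {
	var mu sync.Mutex
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()

		if content := cache.Get(r.RequestURI); content != nil {
			w.Write(content) // hit: respond without running the handler
			return
		}

		rec := httptest.NewRecorder() // record the handler's output
		next(rec, r)
		body := rec.Body.Bytes()
		cache.Set(r.RequestURI, body, d)
		w.Write(body)
	})
}
```

Note that a single Mutex like this serializes hits as well as misses, which is the contention the read-lock discussion below addresses.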
I will try to make the change and post a gist for you to review. Is that ok?
Following your suggestion, I've implemented the mutex operation in the middleware; please see it here: https://gist.github.com/mlevkov/41094e7d536c720e2b09f617347eb1d2
Here are the performance numbers.

Before:
Statistics Avg Stdev Max
Reqs/sec 8028.24 1397.01 14279.64
Latency 18.67ms 30.76ms 1.99s
Latency Distribution
50% 8.37ms
75% 25.89ms
90% 49.43ms
95% 67.02ms
99% 107.95ms
HTTP codes:
1xx - 0, 2xx - 722736, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 21.01MB/s

After:
Statistics Avg Stdev Max
Reqs/sec 17715.84 6642.67 36610.48
Latency 8.46ms 2.80ms 99.89ms
Latency Distribution
50% 5.86ms
75% 10.31ms
90% 17.43ms
95% 21.20ms
99% 29.46ms
HTTP codes:
1xx - 0, 2xx - 1595959, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 46.40MB/s
That's over a 2x improvement, and it no longer reports multiple cache writes for the same page. Please let me know if I've implemented the change correctly. Any other suggestions?
That's a good implementation, but I'd suggest using RLock as well.
It'd be like:
- get a read lock and defer read unlock
- try to get from cache
- if found, respond and return
- if not found, get a lock and defer unlock
- try to get from cache
- if found, respond and return
- if not found, handle the request
- save to cache and respond
A read lock doesn't block other read locks, so you can have multiple read locks held at once; you only need a "write lock" if you're going to write to the cache.
I expect this to also give you slightly higher throughput.
RLock in addition to the mutex, or in place of the mutex?
You can use the same mutex; it should have an RLock func there.
Oops, minor mistake: your mutex should be this one: https://golang.org/pkg/sync/#RWMutex
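To make the RWMutex semantics concrete, here is a standalone toy example (separate from the middleware): multiple goroutines can hold the read lock at the same time, while the write lock is exclusive.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var mu sync.RWMutex
	var wg sync.WaitGroup

	// Several readers can hold the read lock concurrently.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			mu.RLock()
			defer mu.RUnlock()
			fmt.Printf("reader %d holds the read lock\n", id)
			time.Sleep(50 * time.Millisecond) // the readers overlap here
		}(i)
	}

	// The writer's Lock excludes all readers; it waits until
	// no read locks are held before proceeding.
	wg.Add(1)
	go func() {
		defer wg.Done()
		mu.Lock()
		defer mu.Unlock()
		fmt.Println("writer holds the exclusive lock")
	}()

	wg.Wait()
}
```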
I made a quick change following your proposal and am testing it now.
I realized that was the case. Here is the gist: https://gist.github.com/mlevkov/9db36bed5b818e17f5e958a4dda52e3d
The above implementation's performance is:
Statistics Avg Stdev Max
Reqs/sec 7767.36 1390.96 13622.35
Latency 19.33ms 23.62ms 477.92ms
Latency Distribution
50% 6.29ms
75% 11.59ms
90% 76.47ms
95% 94.72ms
99% 125.68ms
HTTP codes:
1xx - 0, 2xx - 698412, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 20.30MB/s
After that Lock, you'll need an additional lookup to the cache. The reason is that two or more goroutines might try to acquire that lock, but only the first should do the work. The subsequent goroutines, after acquiring the lock, should then hit the cache.
Something like this?
```go
func (app *application) cached(duration string, handler func(w http.ResponseWriter, r *http.Request)) http.Handler {
	var mutex = &sync.RWMutex{}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mutex.RLock()
		defer mutex.RUnlock()
		content := app.cache.Get(r.RequestURI)
		if content != nil {
			//fmt.Print("Cache Hit!\n")
			w.Write(content)
		} else {
			mutex.RLock()
			defer mutex.RUnlock()
			if content = app.cache.Get(r.RequestURI); content != nil {
				//fmt.Print("Cache Hit!\n")
				w.Write(content)
				return
			} else {
				c := httptest.NewRecorder()
				handler(c, r)
				for k, v := range c.Result().Header { // c.HeaderMap (deprecated)
					w.Header()[k] = v
				}
				w.WriteHeader(c.Code)
				content := c.Body.Bytes()
				if d, err := time.ParseDuration(duration); err == nil {
					fmt.Printf("New page cached: %s for %s\n", r.RequestURI, duration)
					app.cache.Set(r.RequestURI, content, d)
				} else {
					fmt.Printf("Page not cached. err: %s\n", err)
				}
			}
			w.Write(content)
		}
	})
}
```
Yep. How does that perform? Also, that last write should be inside the brackets.
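In other words (only the tail of the miss branch from the snippet above is shown, not a complete handler): because content := c.Body.Bytes() declares a new content that shadows the outer one, the final write has to sit inside the same block to see the recorded body:

```go
content := c.Body.Bytes() // := shadows the outer content, which is nil on a miss
if d, err := time.ParseDuration(duration); err == nil {
	fmt.Printf("New page cached: %s for %s\n", r.RequestURI, duration)
	app.cache.Set(r.RequestURI, content, d)
} else {
	fmt.Printf("Page not cached. err: %s\n", err)
}
w.Write(content) // moved inside the brackets: writes the recorded body, not the nil outer content
```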
ok, let me make that change. Here is the performance from that one:
Statistics Avg Stdev Max
Reqs/sec 7729.11 1396.59 13176.17
Latency 19.42ms 23.34ms 475.55ms
Latency Distribution
50% 6.28ms
75% 11.71ms
90% 76.80ms
95% 94.16ms
99% 125.27ms
HTTP codes:
1xx - 0, 2xx - 695032, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 20.21MB/s
Here is the performance from the last run, with the w.Write change inside the brackets:
Statistics Avg Stdev Max
Reqs/sec 7715.70 1414.81 13246.45
Latency 19.46ms 23.57ms 349.97ms
Latency Distribution
50% 6.27ms
75% 11.56ms
90% 77.64ms
95% 94.57ms
99% 125.92ms
HTTP codes:
1xx - 0, 2xx - 693625, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 20.16MB/s
Perhaps I'm not implementing this correctly. Would you be open to making a sample that fits the intent? Here is the code that performs well for me so far:
```go
func (app *application) cached(duration string, handler func(w http.ResponseWriter, r *http.Request)) http.Handler {
	var mutex = &sync.Mutex{}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mutex.Lock()
		defer mutex.Unlock()
		content := app.cache.Get(r.RequestURI)
		if content != nil {
			//fmt.Print("Cache Hit!\n")
			w.Write(content)
			return
		} else {
			//mutex.RLock()
			//defer mutex.RUnlock()
			//if content = app.cache.Get(r.RequestURI); content != nil {
			//	//fmt.Print("Cache Hit!\n")
			//	w.Write(content)
			//	return
			//} else {
			c := httptest.NewRecorder()
			handler(c, r)
			for k, v := range c.Result().Header { // c.HeaderMap (deprecated)
				w.Header()[k] = v
			}
			w.WriteHeader(c.Code)
			content := c.Body.Bytes()
			if d, err := time.ParseDuration(duration); err == nil {
				fmt.Printf("New page cached: %s for %s\n", r.RequestURI, duration)
				app.cache.Set(r.RequestURI, content, d)
			} else {
				fmt.Printf("Page not cached. err: %s\n", err)
			}
			w.Write(content)
		}
		//}
	})
}
```
the performance outcome of this code is:
Statistics Avg Stdev Max
Reqs/sec 17194.11 6801.90 38327.80
Latency 8.73ms 3.14ms 122.09ms
Latency Distribution
50% 5.92ms
75% 10.81ms
90% 18.09ms
95% 21.99ms
99% 30.59ms
HTTP codes:
1xx - 0, 2xx - 1546594, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 44.96MB/s
I'm using bombardier for testing at 150 concurrent connections over a 90-second period, just for reference.
That’s interesting; I’d expect it to be better or equal in terms of perf. Will try to have a look at this later on.
That would be great. Thank you!
Hey @mlevkov, try this version.
The difference is that I'm using RLock first and only then a Lock (if the value is not found in the cache).
```go
func cached(duration string, handler func(w http.ResponseWriter, r *http.Request)) http.Handler {
	var mutex = &sync.RWMutex{}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mutex.RLock()
		content := storage.Get(r.RequestURI)
		mutex.RUnlock()
		if content != nil {
			//fmt.Print("Cache Hit!\n")
			w.Write(content)
		} else {
			mutex.Lock()
			defer mutex.Unlock()
			if content = storage.Get(r.RequestURI); content != nil {
				//fmt.Print("Cache Hit!\n")
				w.Write(content)
				return
			}
			c := httptest.NewRecorder()
			handler(c, r)
			for k, v := range c.Result().Header { // c.HeaderMap (deprecated)
				w.Header()[k] = v
			}
			w.WriteHeader(c.Code)
			content := c.Body.Bytes()
			if d, err := time.ParseDuration(duration); err == nil {
				fmt.Printf("New page cached: %s for %s\n", r.RequestURI, duration)
				storage.Set(r.RequestURI, content, d)
			} else {
				fmt.Printf("Page not cached. err: %s\n", err)
			}
			w.Write(content)
		}
	})
}
```
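For reference, wiring this middleware into a server might look like the sketch below. evsHandler and its response body are hypothetical stand-ins for the real endpoint, and it assumes cached and its storage live in the same package:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// evsHandler is a hypothetical stand-in for the real endpoint handler.
func evsHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, `{"path":%q}`, r.URL.Path)
}

func main() {
	// Wrap the handler so responses are cached for 100s, matching the logs above.
	http.Handle("/evs/", cached("100s", evsHandler))
	log.Println("Starting server on :3030")
	log.Fatal(http.ListenAndServe(":3030", nil))
}
```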
hm, that is interesting indeed. Thank you so much! Will try that shortly and report back with results.
@goenning I've tried exactly what you described here. Here are the reported results:
Statistics Avg Stdev Max
Reqs/sec 7050.03 1293.43 12105.09
Latency 21.30ms 26.71ms 419.52ms
Latency Distribution
50% 7.12ms
75% 12.80ms
90% 84.77ms
95% 104.77ms
99% 136.87ms
HTTP codes:
1xx - 0, 2xx - 633731, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 18.42MB/s
what was your performance like?