dragonfly
dragonfly copied to clipboard
performance improvements for pipelining
from running:
./dfly_bench --command "zadd foo nx __score__ __data__" -d 1 -c 1 --qps 0 -n 50000000 --pipeline=5000
I saw we have:
- some bottleneck on absl::GetCurrentTimeNanos()
- Bottleneck on ZzlStrtod (i.e. strtod)
- Bottleneck in
Transaction::StoreKeysInArgs(stub transaction in multi squasher), and in general InitByKeys is quite significant (8-9%) because it probably reads the argument slices in another thread (cold ram). I do not know if we can do anything about it, just noting.
Same command with valkey:
docker run --network="host" --rm --cpuset-cpus="2-3" valkey/valkey:9.0 --save "" --appendonly no --protected-mode no
reaches almost double throughput.
(this scenario is not common in prod, the workload is deliberately chosen to showcase the threading/networking part of both systems for a single connection)