
Proof of Concept: Automatically store strings as raw values

Open casperisfine opened this issue 2 years ago • 7 comments

NB: I'm opening this as a proof of concept because there are a number of specs that need to be updated. It's menial work that I'd rather not do if there is no interest in such a feature, but that I can easily do if the feature is desired.

Context

When a value is already a String, there is little point in using Marshal to serialize it. The only benefit is that it properly preserves the String encoding, but this can instead be stored as a bitflag on the key.
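
To illustrate the idea, here is a minimal sketch of how the encoding could travel in the flags rather than in a Marshal payload. The flag names and values, and the `encode_value` / `decode_value` helpers, are hypothetical and chosen for illustration only; they are not Dalli's actual constants or API.

```ruby
# Hypothetical flag bits (illustrative values, not Dalli's real constants).
FLAG_SERIALIZED = 0x1 # value went through Marshal
FLAG_UTF8       = 0x2 # raw bytes, originally a UTF-8 String
FLAG_BINARY     = 0x4 # raw bytes, originally a binary String

RAW_ENCODING_FLAGS = {
  Encoding::UTF_8  => FLAG_UTF8,
  Encoding::BINARY => FLAG_BINARY
}.freeze

# Returns [bytes, flags]. Plain Strings in a known encoding skip Marshal.
def encode_value(value)
  if value.instance_of?(String) && (flag = RAW_ENCODING_FLAGS[value.encoding])
    [value.b, flag] # String#b returns a binary-encoded copy of the same bytes
  else
    [Marshal.dump(value), FLAG_SERIALIZED]
  end
end

# Rebuilds the original value from the stored bytes and flags.
def decode_value(bytes, flags)
  if (flags & FLAG_SERIALIZED) != 0
    Marshal.load(bytes)
  elsif (flags & FLAG_UTF8) != 0
    bytes.dup.force_encoding(Encoding::UTF_8)
  else
    bytes.dup.force_encoding(Encoding::BINARY)
  end
end

bytes, flags = encode_value("h\u00e9llo")
restored = decode_value(bytes, flags)
puts restored.encoding
```

Anything that is not a plain String in one of the listed encodings still falls back to Marshal in this sketch, so existing values keep working.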

Benchmark

In a simple benchmark reading a 1MB UTF-8 string, it's about twice as fast.

require 'bundler/inline'

gemfile do
  source "https://rubygems.org"
  gem "dalli"
  gem "benchmark-ips"
end

require "dalli"
require "benchmark/ips"

client = Dalli::Client.new("localhost", compress: false)
payload = "B" * 1_000_000
client.set("key", payload)
Benchmark.ips do |x|
  x.report("get 1MB UTF-8") { client.get("key") }
end
$ ruby /tmp/benchmark-dalli.rb
Warming up --------------------------------------
       get 1MB UTF-8   156.000  i/100ms
Calculating -------------------------------------
       get 1MB UTF-8      1.582k (± 2.7%) i/s -      7.956k in   5.031764s
$ ruby -Ilib /tmp/benchmark-dalli.rb
Warming up --------------------------------------
       get 1MB UTF-8   280.000  i/100ms
Calculating -------------------------------------
       get 1MB UTF-8      2.798k (± 4.3%) i/s -     14.000k in   5.012061s

This is inspired by my work on our in-house serializer library: https://github.com/Shopify/paquito/pull/20

casperisfine avatar Nov 29 '22 14:11 casperisfine

Any idea how that 2x benefit scales with payload size? I don't think storing 1 MB strings is particularly unusual, but I'm also not sure it's the highest frequency case. And there's additional conceptual overhead in the API by adding these as explicit encoding options.

How would this look in "real" apps? Would this be a big benefit if, for example, the Rails cache checked if an object was a String before adding and used the encoding flag?

Thoughts?

petergoldstein avatar Nov 29 '22 14:11 petergoldstein

> Any idea how that 2x benefit scales with payload size?

It's more or less linear. Marshal is relatively fast at serializing strings since most of it is just adding a prefix and then doing a memcpy. I'll expand the benchmark to test different string sizes.
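
As a rough illustration of the "prefix plus memcpy" point, the snippet below compares the size of a Marshal dump with the size of the string it wraps. It only uses Ruby's built-in Marshal and is independent of Dalli; the payload sizes are arbitrary.

```ruby
# Compare the Marshal dump size with the raw string size.
small = "abc"
large = "B" * 1_000_000

[small, large].each do |payload|
  dumped = Marshal.dump(payload)
  overhead = dumped.bytesize - payload.bytesize
  # The dump embeds the string bytes verbatim, wrapped in a short header
  # and a trailing encoding marker, so the overhead stays small.
  puts "payload=#{payload.bytesize}B dump=#{dumped.bytesize}B overhead=#{overhead}B"
end

# Loading the dump restores both the content and the encoding.
restored = Marshal.load(Marshal.dump(large))
puts restored.encoding
```

The framing overhead is only a few bytes, so the time spent is dominated by copying the string itself, which is consistent with the difference growing with payload size.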

> Would this be a big benefit if, for example, the Rails cache checked if an object was a String before adding and used the encoding flag?

Well, since I refactored it in Rails 7.0, Rails' MemCacheStore always passes a string to dalli. That said, we could initialize Dalli with serialize: false, but it wasn't done before, so it would mean breaking backward compat :/

Another advantage of this feature is that it allows preserving common string encodings when using raw: true.

casperisfine avatar Nov 29 '22 16:11 casperisfine

Here's an updated benchmark.

On my machine (M1 Pro), the difference starts to be significant at 150kB, and then it grows more or less linearly from there.

Note that this is pretty much a memcpy benchmark, so it might change quite a bit based on RAM speed etc.

== 100kB ==
Warming up --------------------------------------
             patched     1.236k i/100ms
Calculating -------------------------------------
             patched     13.394k (± 6.8%) i/s -     66.744k in   5.006565s

Comparison:
             patched:    13394.0 i/s
            baseline:    13216.6 i/s - same-ish: difference falls within error

== 150kB ==
Warming up --------------------------------------
             patched     1.115k i/100ms
Calculating -------------------------------------
             patched     11.133k (± 6.8%) i/s -     55.750k in   5.031202s

Comparison:
             patched:    11132.9 i/s
            baseline:     9636.4 i/s - 1.16x  (± 0.00) slower

== 250kB ==
Warming up --------------------------------------
             patched   780.000  i/100ms
Calculating -------------------------------------
             patched      6.828k (± 6.8%) i/s -     34.320k in   5.049507s

Comparison:
             patched:     6827.6 i/s
            baseline:     5140.8 i/s - 1.33x  (± 0.00) slower

== 500kB ==
Warming up --------------------------------------
             patched   398.000  i/100ms
Calculating -------------------------------------
             patched      3.950k (± 5.4%) i/s -     19.900k in   5.051562s

Comparison:
             patched:     3950.2 i/s
            baseline:     2593.9 i/s - 1.52x  (± 0.00) slower

== 1000kB ==
Warming up --------------------------------------
             patched   223.000  i/100ms
Calculating -------------------------------------
             patched      2.257k (± 4.4%) i/s -     11.373k in   5.048974s

Comparison:
             patched:     2256.8 i/s
            baseline:     1315.9 i/s - 1.71x  (± 0.00) slower
Benchmark source
# frozen_string_literal: true


version = ENV["PATCH"] ? "patched" : "baseline"
if ENV["PATCH"]
  $LOAD_PATH.unshift("lib")
end

require 'bundler/inline'

gemfile do
  source "https://rubygems.org"
  gem "dalli"
  gem "benchmark-ips"
end

require "dalli"
require "benchmark/ips"

client = Dalli::Client.new("localhost", compress: false)
[100, 150, 250, 500, 1_000].each do |size|
  puts "== #{size}kB =="
  payload = "B" * 1_000 * size
  client.set("key", payload)
  Benchmark.ips do |x|
    x.report(version) { client.get("key") }
    x.save!("/tmp/dalli-bench-#{size}kb.data")
    x.compare!
  end
end

casperisfine avatar Nov 29 '22 16:11 casperisfine

Hey @casperisfine, could I know where the strings come from? Are they HTML, randomly generated strings, or query results?

drinkbeer avatar Nov 29 '22 23:11 drinkbeer

@drinkbeer the benchmark source is provided, it's just payload = "B" * 1_000 * size.

The content of the string doesn't matter here because I initialize Dalli with compress: false to not skew the results.

casperisfine avatar Nov 30 '22 07:11 casperisfine

If this feature is deemed undesirable, I'd like to suggest an alternative, which is to allow passing a custom ValueMarshaller, so that users can implement this kind of logic using flags themselves.

casperisfine avatar Dec 01 '22 08:12 casperisfine