net-http icon indicating copy to clipboard operation
net-http copied to clipboard

Replace Regexp in for headers for perf

Open baweaver opened this issue 1 year ago • 5 comments

I had noticed that Net::HTTP is splitting on a Regexp in the headers file, so wanted to put in a quick patch on that. Here's some of the performance data to back up this change:

require 'benchmark/ips'
require 'memory_profiler'

# Current method
def capitalize(name)
  name.to_s.split(/-/).map {|s| s.capitalize }.join('-')
end

# Enhanced method
def capitalize_new(name)
  name.to_s.split('-').map(&:capitalize).join('-')
end

DEMO_STRING = 'abc-def-xyz'

puts "capitalize(DEMO_STRING): #{capitalize(DEMO_STRING)}"
puts "capitalize_new(DEMO_STRING): #{capitalize_new(DEMO_STRING)}"

raise 'Not equal' unless capitalize(DEMO_STRING) == capitalize_new(DEMO_STRING)

Benchmark.ips do |x|
  x.report("Current capitalize") { capitalize(DEMO_STRING) }
  x.report("Enhanced capitalize") { capitalize_new(DEMO_STRING) }

  x.compare!
end

# Warming up --------------------------------------
#   Current capitalize    74.573k i/100ms
#  Enhanced capitalize   111.936k i/100ms
# Calculating -------------------------------------
#   Current capitalize    716.449k (± 2.9%) i/s -      3.654M in   5.104556s
#  Enhanced capitalize      1.100M (± 2.5%) i/s -      5.597M in   5.093632s

# Comparison:
#  Enhanced capitalize:  1099514.8 i/s
#   Current capitalize:   716449.3 i/s - 1.53x  slower

# Now for memory profiling

puts 'Current capitalize', '=' * 50, ''

MemoryProfiler.report {
  10_000.times { capitalize(DEMO_STRING) }
}.pretty_print

# allocated memory by gem
# -----------------------------------
#    6_480_000  other
# allocated objects by class
# -----------------------------------
#     100_000  String
#      20_000  Array
#      10_000  MatchData

puts '', 'New capitalize', '=' * 50, ''

MemoryProfiler.report {
  10_000.times { capitalize_new(DEMO_STRING) }
}.pretty_print

# allocated memory by gem
# -----------------------------------
#    4_400_000  other
# allocated objects by class
# -----------------------------------
#      90_000  String
#      20_000  Array

We could additionally hoist and freeze the header delimiter for an additional gain, but I wanted to keep this PR minimal and a constant being hoisted would introduce a potential public API surface for users to use.

My reasoning for doing this is that we have memory profiles from running Capybara tests that have this as a hot path in terms of memory and object allocations.

baweaver avatar May 19 '23 21:05 baweaver

Side topic: I've found a few additional areas where we might get some easy wins if you're up to spot check me on some more PRs later @byroot.

baweaver avatar May 19 '23 22:05 baweaver

Sure. But note that I'm not the maintainer, I can't merge your changes. https://github.com/ruby/ruby/blob/master/doc/maintainers.md#libnethttprb-libnethttpsrb

byroot avatar May 20 '23 06:05 byroot

Fair fair. I might ask @nobu, @hsbt, or @nurse if they'd be up to work with me on that as well. Going to go back through some of the memory profiles I have to see where other hot paths are. I know there are a few in Response as well.

baweaver avatar May 20 '23 06:05 baweaver

Would one of @nobu / @hsbt / @nurse be up to take a look at this?

baweaver avatar Aug 03 '23 16:08 baweaver

@baweaver if you pull latest master down, you won't need to explicitly freeze now that https://github.com/ruby/net-http/pull/144 landed 🚀

technicalpickles avatar Aug 16 '23 19:08 technicalpickles

Sorry to late response. This PR helps to work with https://bugs.ruby-lang.org/issues/20205 in the future.

hsbt avatar May 30 '24 09:05 hsbt