Ruby with jemalloc
I would like to create a Ruby version compiled with jemalloc. Is this something the community needs, and are official contributions for it encouraged? If so, I can try to put one together.
We'd like to do the same thing and have started down the path of building our own Ruby Docker image with jemalloc support. That involves a whole lot of copying and pasting of Dockerfiles from this repo.
It would be great if jemalloc could be offered as an option for the images published on Docker Hub; I think a lot of other Ruby users would find this valuable.
@jonmoter could you open a PR in [WIP] status? I will also contribute to it and add my changes.
@Vad1mo @jonmoter something like this? https://github.com/docker-library/ruby/pull/190
If anyone is looking to run this now, I've forked this repository, merged #192, and added a public, automated Docker Hub image that can be found here: https://hub.docker.com/r/swipesense/ruby-jemalloc
We've been using this in production for a couple of weeks now and have seen great results. I will maintain that Docker Hub image and repo until this request is implemented here.
@jorihardman amazing work! Could you enable the issue tracker on that repo? Also, any plans to support Alpine?
@ashleyhull-versent thanks for the Alpine PR. Images are being built as I type this.
https://github.com/docker-library/ruby/pull/198#issuecomment-431638495
We're still hesitant to add this to the image without a recommendation by Ruby upstream. In the meantime, you can trivially add support in your own Dockerfile without recompiling Ruby through the magic of LD_PRELOAD:
FROM ruby:2.6
RUN apt-get update && apt-get install -y libjemalloc1 && rm -rf /var/lib/apt/lists/*
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
# gleaned from a few sources
# https://github.com/jemalloc/jemalloc/wiki/Getting-Started
# https://brandonhilkert.com/blog/reducing-sidekiq-memory-usage-with-jemalloc/
And done, ruby now uses jemalloc. :tada: :taco:
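One quick way to confirm the preload actually took effect (a preloaded allocator will never show up in RbConfig, as discussed further down) is to check the process memory maps at runtime. A rough sketch, where the image tag is just a placeholder for whatever you build from the Dockerfile above:
$ docker build -t ruby-jemalloc-test .
$ docker run --rm ruby-jemalloc-test ruby -e 'puts File.readlines("/proc/self/maps").grep(/jemalloc/).any?'
This should print true when libjemalloc has been mapped into the Ruby process.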
@yosifkit I tried using LD_PRELOAD, but I have no clue why jemalloc isn't being loaded:
FROM ruby:2.4.2
RUN apt-get update && apt-get install -y libjemalloc-dev libjemalloc1
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
...
Then, inside the container:
> echo $LD_PRELOAD
/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
> ruby -r rbconfig -e "puts RbConfig::CONFIG['LIBS']"
-lpthread -ldl -lcrypt -lm
Any idea why jemalloc isn't listed? :crying_cat_face:
Edit: ruby:2.6 won't work either.
@ftuyama in Ruby 2.6 I had to look for jemalloc in RbConfig::CONFIG['MAINLIBS']
(this is on my laptop, not in docker - we use the alpine image and there is an issue with jemalloc/musl right now)
I am having trouble with jemalloc not being loaded as well.
@tjwallace I am using Ruby 2.6 and I got nothing in RbConfig::CONFIG['MAINLIBS']. :cry:
Using a Docker image where Ruby is compiled with jemalloc, jemalloc does actually appear in MAINLIBS: https://hub.docker.com/r/swipesense/ruby-jemalloc
> ruby -r rbconfig -e "puts RbConfig::CONFIG['MAINLIBS']"
-lz -lpthread -lrt -lrt -ljemalloc -ldl -lcrypt -lm
But I couldn't load it with LD_PRELOAD.
The whole point of LD_PRELOAD would be that Ruby isn't even aware it's being used -- it should be completely transparent to Ruby.
Normally, if you run the following and see statistics output, Ruby should be using jemalloc.
MALLOC_CONF=stats_print:true ruby -e "exit"
Nice @butsjoh! I tested using @yosifkit's example above with MALLOC_CONF set to stats_print:true and got the following output: :muscle: :+1:
___ Begin jemalloc statistics ___
Version: 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340
Assertions disabled
Run-time option settings:
opt.abort: false
opt.lg_chunk: 22
opt.dss: "secondary"
opt.narenas: 48
opt.lg_dirty_mult: 3
opt.stats_print: true
opt.junk: false
opt.quarantine: 0
opt.redzone: false
opt.zero: false
opt.tcache: true
opt.lg_tcache_max: 15
CPUs: 12
Arenas: 48
Pointer size: 8
Quantum size: 16
Page size: 4096
Min active:dirty page ratio per arena: 8:1
Maximum thread-cached size class: 32768
Chunk size: 4194304 (2^22)
Allocated: 37482352, active: 38195200, mapped: 46137344
Current active ceiling: 41943040
chunks: nchunks highchunks curchunks
11 11 11
huge: nmalloc ndalloc allocated
1 0 33554432
arenas[0]:
assigned threads: 1
dss allocation precedence: disabled
dirty pages: 1133:100 active:dirty, 1 sweep, 3 madvises, 265 purged
allocated nmalloc ndalloc nrequests
small: 2486128 36132 13694 70467
large: 1441792 157 51 666
total: 3927920 36289 13745 71133
active: 4640768
mapped: 8388608
bins: bin size regs pgs allocated nmalloc ndalloc nrequests nfills nflushes newruns reruns curruns
0 8 501 1 12976 1725 103 2022 19 2 4 2 4
1 16 252 1 30672 2175 258 3118 25 5 8 7 8
2 32 126 1 137600 5400 1100 11084 54 11 39 50 35
3 48 84 1 249984 7728 2520 9328 92 30 62 135 62
4 64 63 1 120960 5670 3780 13134 90 60 43 191 30
5 80 50 1 84000 4550 3500 9348 91 70 29 189 27
6 96 84 2 183648 2016 103 2932 29 4 24 3 23
7 112 72 2 30800 324 49 493 8 5 4 2 4
8 128 63 2 91520 1068 353 3702 20 7 13 13 12
9 160 51 2 78080 508 20 2011 12 2 10 0 10
10 192 63 3 41472 312 96 464 8 5 6 2 4
11 224 72 4 38304 270 99 304 6 5 5 4 3
12 256 63 4 42240 297 132 411 7 5 6 3 3
13 320 63 5 443200 1471 86 1826 33 3 23 4 22
14 384 63 6 46080 218 98 255 6 5 3 1 2
15 448 63 7 150976 376 39 461 8 3 7 1 6
16 512 63 8 58368 186 72 230 6 6 2 0 2
17 640 51 8 111360 901 727 8303 22 16 6 33 6
18 768 47 9 62208 149 68 149 7 6 3 1 2
19 896 45 10 43904 111 62 86 4 5 3 0 2
20 1024 63 16 73728 123 51 231 5 6 2 0 2
21 1280 51 16 52480 100 59 93 4 6 1 0 1
22 1536 42 16 49152 83 51 112 4 5 2 0 1
23 1792 38 17 21504 50 38 49 4 5 1 0 1
24 2048 65 33 73728 105 69 158 4 6 1 0 1
25 2560 52 33 79360 97 66 72 5 6 1 0 1
26 3072 43 33 49152 63 47 66 3 4 1 0 1
27 3584 39 35 28672 56 48 25 4 6 1 0 1
large: size pages nmalloc ndalloc nrequests curruns
4096 1 22 5 117 17
8192 2 38 7 353 31
12288 3 11 2 48 9
16384 4 55 13 98 42
20480 5 1 1 1 0
24576 6 5 2 5 3
28672 7 1 1 1 0
32768 8 3 1 22 2
36864 9 3 3 3 0
[2]
49152 12 2 2 2 0
[2]
61440 15 7 7 7 0
[4]
81920 20 1 0 1 1
[3]
98304 24 2 1 2 1
[4]
118784 29 3 3 3 0
[18]
196608 48 1 1 1 0
[8]
233472 57 1 1 1 0
[198]
1048576 256 1 1 1 0
[762]
--- End jemalloc statistics ---
I am not an expert on the matter, but that is what I saw other people recommending as a way to check whether it is being used. I do not fully understand yet how installing jemalloc on the system and setting LD_PRELOAD affects software other than Ruby. Can anybody explain that to me? :)
@tianon You mentioned that setting LD_PRELOAD works transparently for Ruby, but does it then affect other software as well, as opposed to compiling it into Ruby?
Yeah, it will likely make most applications switch to jemalloc if they're invoked as a sub-process of Ruby (or invoked with the environment variable set), although that could be avoided by explicitly resetting that environment variable when spawning the subprocess from Ruby.
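A minimal sketch of that reset from Ruby: Process.spawn and Kernel#system accept an env hash, and a nil value removes that variable from the child's environment (the sh/echo child here is purely illustrative). Run inside a container built with the ENV above, the child sees an empty LD_PRELOAD even though the parent Ruby process keeps using jemalloc:
$ ruby -e 'system({ "LD_PRELOAD" => nil }, "sh", "-c", "echo LD_PRELOAD=$LD_PRELOAD")'
LD_PRELOAD=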
I assume this is still broken on Alpine? Could anybody confirm?
Not sure if there are still issues with running Ruby/jemalloc on Alpine (the LD_PRELOAD hackery could be used to test it), but the new problem with Alpine is that they dropped jemalloc as unmaintained: https://git.alpinelinux.org/aports/commit/main/jemalloc?id=21836d30f5db21f9f3bb4fe3d70ddd4802ed9541. So it does not exist in the newest release, Alpine 3.9.
As stated in this comment (https://github.com/jemalloc/jemalloc/issues/1443#issuecomment-466500866), Alpine uses musl libc which internally uses malloc and there are no plans to support jemalloc for now.
The LD_PRELOAD approach above seems to work in my env with Rails 5.2, Ruby 2.5.3, and Docker. Thanks!
If I try with 2.5:
FROM ruby:2.5
RUN apt-get update && apt-get install libjemalloc1 && rm -rf /var/lib/apt/lists/*
libjemalloc1 is not an available package, but libjemalloc2 is. If I install the latter, it does not appear when running
ruby -r rbconfig -e "puts RbConfig::CONFIG['LIBS']"
nor
ruby -r rbconfig -e "puts RbConfig::CONFIG['MAINLIBS']"
You'll also need to set LD_PRELOAD, and for libjemalloc2 it probably needs to point at /usr/lib/x86_64-linux-gnu/libjemalloc.so.2.
Same issue here
@tianon I tried that, but no changes. :confused:
FROM ruby:3.0-buster
RUN apt-get update && apt-get install libjemalloc2 && rm -rf /var/lib/apt/lists/*
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
$ docker run -it --rm --env MALLOC_CONF=stats_print:true 11636dada43a ruby -e 'exit'
___ Begin jemalloc statistics ___
Version: "5.1.0-0-g61efbda7098de6fe64c362d309824864308c36d4"
Build-time option settings
config.cache_oblivious: true
config.debug: false
config.fill: true
config.lazy_lock: false
config.malloc_conf: ""
config.prof: true
config.prof_libgcc: true
config.prof_libunwind: false
config.stats: true
config.utrace: false
config.xmalloc: false
Run-time option settings
opt.abort: false
opt.abort_conf: false
opt.retain: true
opt.dss: "secondary"
opt.narenas: 48
opt.percpu_arena: "disabled"
opt.metadata_thp: "disabled"
opt.background_thread: false (background_thread: false)
opt.dirty_decay_ms: 10000 (arenas.dirty_decay_ms: 10000)
opt.muzzy_decay_ms: 10000 (arenas.muzzy_decay_ms: 10000)
opt.junk: "false"
opt.zero: false
opt.tcache: true
opt.lg_tcache_max: 15
opt.thp: "default"
opt.prof: false
opt.prof_prefix: "jeprof"
opt.prof_active: true (prof.active: false)
opt.prof_thread_active_init: true (prof.thread_active_init: false)
opt.lg_prof_sample: 19 (prof.lg_sample: 0)
opt.prof_accum: false
opt.lg_prof_interval: -1
opt.prof_gdump: false
opt.prof_final: false
opt.prof_leak: false
opt.stats_print: true
opt.stats_print_opts: ""
Profiling settings
prof.thread_active_init: false
prof.active: false
prof.gdump: false
prof.interval: 0
prof.lg_sample: 0
Arenas: 48
Quantum size: 16
Page size: 4096
Maximum thread-cached size class: 32768
Number of bin size classes: 36
Number of thread-cache bin size classes: 41
Number of large size classes: 196
Allocated: 38019880, active: 38617088, metadata: 3151800 (n_thp 0), resident: 42512384, mapped: 45674496, retained: 7278592
Background threads: 0, num_runs: 0, run_interval: 0 ns
n_lock_ops n_waiting n_spin_acq n_owner_switch total_wait_ns max_wait_ns max_n_thds
background_thread 4 0 0 1 0 0 0
ctl 2 0 0 1 0 0 0
prof 0 0 0 0 0 0 0
arenas[0]:
assigned threads: 1
uptime: 47998680
dss allocation precedence: "secondary"
decaying: time npages sweeps madvises purged
dirty: 10000 187 0 0 0
muzzy: 10000 0 0 0 0
allocated nmalloc ndalloc nrequests
small: 2413352 28339 5535 60829
large: 35606528 82 31 82
total: 38019880 28421 5566 60911
active: 38617088
mapped: 45674496
retained: 7278592
base: 3123120
internal: 28680
metadata_thp: 0
tcache_bytes: 287624
resident: 42512384
n_lock_ops n_waiting n_spin_acq n_owner_switch total_wait_ns max_wait_ns max_n_thds
large 10 0 0 1 0 0 0
extent_avail 403 0 0 3 0 0 0
extents_dirty 712 0 0 3 0 0 0
extents_muzzy 284 0 0 3 0 0 0
extents_retained 567 0 0 3 0 0 0
decay_dirty 8 0 0 1 0 0 0
decay_muzzy 8 0 0 1 0 0 0
base 433 0 0 3 0 0 0
tcache_list 4 0 0 1 0 0 0
bins: size ind allocated nmalloc ndalloc nrequests curregs curslabs regs pgs util nfills nflushes nslabs nreslabs n_lock_ops n_waiting n_spin_acq n_owner_switch total_wait_ns max_wait_ns max_n_thds
8 0 12792 1651 52 2047 1599 4 512 1 0.780 18 2 4 0 28 0 0 1 0 0 0
16 1 43360 2750 40 3771 2710 11 256 1 0.962 29 1 11 0 44 0 0 1 0 0 0
32 2 165600 5700 525 13167 5175 41 128 1 0.986 61 6 42 29 113 0 0 1 0 0 0
48 3 230400 6500 1700 9353 4800 19 256 3 0.986 65 17 22 58 110 0 0 1 0 0 0
64 4 131072 3264 1216 9367 2048 32 64 1 1 51 19 32 138 105 0 0 1 0 0 0
80 5 62240 956 178 2509 778 4 256 5 0.759 23 5 4 6 35 0 0 1 0 0 0
96 6 116352 1650 438 3020 1212 10 128 3 0.946 18 6 12 11 41 0 0 1 0 0 0
112 7 120848 1100 21 1281 1079 5 256 7 0.842 14 3 5 0 25 0 0 1 0 0 0
128 8 59136 464 2 1514 462 15 32 1 0.962 16 1 15 0 35 0 0 1 0 0 0
160 9 51360 450 129 1892 321 3 128 5 0.835 7 5 4 0 20 0 0 1 0 0 0
192 10 32256 256 88 305 168 3 64 3 0.875 6 4 4 1 18 0 0 1 0 0 0
224 11 28448 225 98 336 127 1 128 7 0.992 5 5 1 0 14 0 0 1 0 0 0
256 12 62976 252 6 442 246 16 16 1 0.960 18 2 16 0 39 0 0 1 0 0 0
320 13 388160 1232 19 1628 1213 19 64 5 0.997 24 2 19 2 48 0 0 1 0 0 0
384 14 60672 192 34 366 158 5 32 3 0.987 8 3 5 2 19 0 0 1 0 0 0
448 15 94528 272 61 298 211 4 64 7 0.824 8 5 5 0 22 0 0 1 0 0 0
512 16 48640 104 9 215 95 12 8 1 0.989 14 4 13 3 35 0 0 1 0 0 0
640 17 73600 736 621 7484 115 8 32 5 0.449 29 22 10 39 66 0 0 1 0 0 0
768 18 26112 56 22 130 34 3 16 3 0.708 5 6 3 1 17 0 0 1 0 0 0
896 19 40320 88 43 85 45 2 32 7 0.703 5 4 3 1 16 0 0 1 0 0 0
1024 20 47104 55 9 153 46 12 4 1 0.958 6 4 14 2 29 0 0 1 0 0 0
1280 21 44800 64 29 117 35 3 16 5 0.729 6 5 4 0 19 0 0 1 0 0 0
1536 22 19968 30 17 74 13 3 8 3 0.541 4 4 5 3 18 0 0 1 0 0 0
1792 23 21504 36 24 51 12 2 16 7 0.375 5 5 2 1 15 0 0 1 0 0 0
2048 24 69632 50 16 414 34 18 2 1 0.944 6 4 26 0 47 0 0 1 0 0 0
2560 25 35840 32 18 40 14 3 8 5 0.583 6 5 5 2 21 0 0 1 0 0 0
3072 26 27648 20 11 31 9 3 4 3 0.750 5 5 6 2 22 0 0 1 0 0 0
3584 27 7168 13 11 7 2 1 8 7 0.250 4 4 2 1 14 0 0 1 0 0 0
4096 28 49152 23 11 217 12 12 1 1 1 5 6 23 0 48 0 0 1 0 0 0
5120 29 20480 16 12 267 4 1 4 5 1 4 5 4 1 19 0 0 1 0 0 0
6144 30 36864 22 16 18 6 4 2 3 0.750 4 5 10 4 28 0 0 1 0 0 0
7168 31 14336 14 12 3 2 1 4 7 0.500 3 3 5 0 18 0 0 1 0 0 0
8192 32 114688 25 11 140 14 14 1 2 1 5 5 25 0 49 0 0 1 0 0 0
10240 33 30720 12 9 71 3 2 2 5 0.750 2 3 6 0 18 0 0 1 0 0 0
12288 34 24576 17 15 14 2 2 1 3 1 3 4 17 0 42 0 0 1 0 0 0
14336 35 0 12 12 2 0 0 2 7 1 2 3 6 0 20 0 0 1 0 0 0
large: size ind allocated nmalloc ndalloc nrequests curlextents
16384 36 720896 56 12 80 44
20480 37 0 1 1 1 0
24576 38 24576 3 2 3 1
28672 39 28672 1 0 1 1
32768 40 0 2 2 14 0
40960 41 0 3 3 3 0
49152 42 49152 2 1 2 1
---
65536 44 0 6 6 6 0
81920 45 81920 1 0 1 1
98304 46 98304 1 0 1 1
---
131072 48 0 2 2 2 0
---
262144 52 0 1 1 1 0
---
393216 54 0 1 1 1 0
---
1048576 60 1048576 1 0 1 1
---
33554432 80 33554432 1 0 1 1
---
--- End jemalloc statistics ---
FWIW - We've been running jemalloc 5.2.1 in production (for the better part of a year) using the LD_PRELOAD method, and it has served us wonderfully compared to the default musl allocator in Alpine 3.x. Having a build variant, or having Alpine default to jemalloc, would be a welcome improvement to our Docker pipeline.
how installing jemalloc on the system and setting LD_PRELOAD affects software other than Ruby
@butsjoh @tianon I have the same question! We are experimenting with it and comparing two options: (a) Ruby with jemalloc statically linked in, and (b) Ruby statically linked plus jemalloc dynamically loaded via LD_PRELOAD, so all fork()'ed processes in the container benefit from jemalloc's memory allocation.
It's too early to draw conclusions, but we have started to see some latency improvements, especially from Ruby gems that depend on C/C++ libraries and require compilation, like mysql2 (https://github.com/brianmario/mysql2/tree/master/ext/mysql2) and openssl (https://github.com/ruby/openssl/tree/master/ext/openssl), and pretty much anything that involves network sockets.
Based on preliminary testing, non-Ruby processes stick with the default memory allocator when we only statically link Ruby with jemalloc (a); we see improvements overall when jemalloc is dynamically loaded via LD_PRELOAD (b), so every process gets its benefits automatically.
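For anyone curious what option (a) looks like in practice, here is a rough sketch of compiling Ruby against jemalloc in a Docker build stage. This is not how the official images are built; the base image, Ruby version, and flags are illustrative assumptions for the ./configure --with-jemalloc route:
FROM buildpack-deps:buster
RUN apt-get update && apt-get install -y --no-install-recommends libjemalloc-dev && rm -rf /var/lib/apt/lists/*
RUN wget -O ruby.tar.gz "https://cache.ruby-lang.org/pub/ruby/2.6/ruby-2.6.5.tar.gz" \
 && tar -xzf ruby.tar.gz && cd ruby-2.6.5 \
 && ./configure --with-jemalloc --disable-install-doc \
 && make -j "$(nproc)" && make install
# RbConfig::CONFIG['MAINLIBS'] should now include -ljemalloc, as shown earlier in this thread
RUN ruby -r rbconfig -e "puts RbConfig::CONFIG['MAINLIBS']"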
Is there a recommendation on which libjemalloc version to use? I see in the messages above that libjemalloc1 is being used with Ruby 2.6 and libjemalloc2 with Ruby 3. Can I use, for example, libjemalloc2 with ruby:2.6-buster?
We're using libjemalloc2 on ruby:2.7.4-slim-bullseye via ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 without issue.
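Not a definitive answer for ruby:2.6-buster, but Debian buster ships jemalloc 5.x as libjemalloc2 as well, so a sketch along these lines should work there too (the library path is the usual amd64 location and is worth verifying inside your own image):
FROM ruby:2.6-buster
RUN apt-get update && apt-get install -y --no-install-recommends libjemalloc2 && rm -rf /var/lib/apt/lists/*
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
# prints jemalloc statistics at build time if the preload is working
RUN MALLOC_CONF=stats_print:true ruby -e "exit"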
3.0.2 on Alpine 3.14 seems to work too (requires a "manual" jemalloc build, though). Dockerfile:
FROM ruby:3.0.2-alpine3.14 AS builder
RUN apk add build-base
RUN wget -O - https://github.com/jemalloc/jemalloc/releases/download/5.2.1/jemalloc-5.2.1.tar.bz2 | tar -xj && \
cd jemalloc-5.2.1 && \
./configure && \
make && \
make install
FROM ruby:3.0.2-alpine3.14
COPY --from=builder /usr/local/lib/libjemalloc.so.2 /usr/local/lib/
ENV LD_PRELOAD=/usr/local/lib/libjemalloc.so.2
RUN MALLOC_CONF=stats_print:true ruby -e "exit"
# ... your stuffs
^ this one is multistage, but does not have to be.
RUN gem install racc -v '1.5.2'
gives an error with libjemalloc2 (works fine without it), while building linux/x86_64 image on M1 Mac.
FROM ruby:2.7.1-slim
RUN apt-get update -qq && apt-get install -y --no-install-recommends build-essential libjemalloc2
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
