Memory usage regression compared to Windows heap allocator

Open Zoxc opened this issue 9 months ago • 6 comments

Testing out mimalloc (v2.2.2) in the Rust compiler shows some large regressions in physical memory use in some scenarios.

I don't observe these regressions in my Rust port and I suspect it is because I walk the entire list of abandoned segments instead of exiting early. I wonder if a flag to do the same in mimalloc could be added.
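To make the difference between the two policies concrete, here is a toy model. It is not mimalloc's code and not the Rust port's code: the `AbandonedSegment` type, the `max_tries` cutoff, and both function names are hypothetical, and "reusable" is reduced to "has at least one free block".

```rust
/// Toy stand-in for an abandoned segment (hypothetical, not mimalloc's layout).
#[derive(Debug, Clone, PartialEq)]
struct AbandonedSegment {
    id: usize,
    free_blocks: usize,
}

/// Early-exit policy: inspect at most `max_tries` segments, then stop.
/// Segments past the cutoff stay abandoned even if they are reusable.
fn reclaim_early_exit(
    abandoned: &mut Vec<AbandonedSegment>,
    max_tries: usize,
) -> Vec<AbandonedSegment> {
    let cutoff = max_tries.min(abandoned.len());
    let mut reclaimed = Vec::new();
    let mut kept = Vec::new();
    for (i, seg) in abandoned.drain(..).enumerate() {
        if i < cutoff && seg.free_blocks > 0 {
            reclaimed.push(seg); // visited and reusable
        } else {
            kept.push(seg); // either not visited or nothing free in it
        }
    }
    *abandoned = kept;
    reclaimed
}

/// Full-walk policy: the same scan with the cutoff removed.
fn reclaim_full_walk(abandoned: &mut Vec<AbandonedSegment>) -> Vec<AbandonedSegment> {
    let len = abandoned.len();
    reclaim_early_exit(abandoned, len)
}
```

In this model, any reusable segment that sits behind the cutoff is never reclaimed by the early-exit scan, so a thread would request fresh memory while reusable memory stays parked on the list. Whether that is what drives the regression reported here is the suspicion stated above, not something the sketch proves.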

Zoxc avatar Mar 21 '25 02:03 Zoxc

Ah, that is not great -- I'll look into it. However, in the past months there has been a lot of development on mimalloc v3 (the dev3 branch), which specifically addresses such memory issues (and has an improved ownership model which might be better suited for a Rust port as well).

Would it be easy for you to try dev3 and see if it improves matters? (In dev3 the idea is to make abandonment cheap and increase sharing of pages between different threads -- I'm working on a writeup but haven't gotten to it yet.) Best, Daan

daanx avatar Mar 21 '25 02:03 daanx

It does appear that v3 fixes the memory regression. I also did a quick performance check of v2.2.2 (Before) against v3 (After), and v3 is slower. Is there a change regarding committed memory that's reducing performance? I see that committed memory is significantly reduced.

| Benchmark | Time (Before) | Time (After) | Time % | Physical Memory (Before) | Physical Memory (After) | Physical Memory % | Committed Memory (Before) | Committed Memory (After) | Committed Memory % |
|---|---|---|---|---|---|---|---|---|---|
| 🟣 clap:check | 1.2042s | 1.2288s | 💔 2.04% | 147.58 MiB | 147.71 MiB | 0.08% | 261.81 MiB | 216.67 MiB | 💚 -17.24% |
| 🟣 hyper:check | 0.2061s | 0.2060s | -0.02% | 80.86 MiB | 78.94 MiB | 💚 -2.38% | 195.31 MiB | 142.50 MiB | 💚 -27.04% |
| 🟣 regex:check | 0.6887s | 0.7010s | 💔 1.78% | 108.12 MiB | 108.02 MiB | -0.09% | 223.32 MiB | 169.17 MiB | 💚 -24.24% |
| 🟣 syn:check | 1.1358s | 1.1484s | 💔 1.10% | 141.14 MiB | 143.11 MiB | 💔 1.39% | 255.29 MiB | 209.21 MiB | 💚 -18.05% |
| Total | 3.2348s | 3.2842s | 💔 1.53% | 477.70 MiB | 477.78 MiB | 0.02% | 935.71 MiB | 737.56 MiB | 💚 -21.18% |
| Summary | 1.0000s | 1.0123s | 💔 1.23% | 1 byte | 1.00 bytes | -0.25% | 1 byte | 0.78 bytes | 💚 -21.64% |
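
For readers checking the percentage columns: each per-benchmark value is consistent with `(after - before) / before`, and the Summary percentage is consistent with the arithmetic mean of the four per-benchmark percentages. That second point is inferred from the numbers in the table, not taken from rcb's documentation. A small sketch using the committed-memory column (the function and constant names are made up for illustration):

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Arithmetic mean of a slice; `None` for an empty slice.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    Some(values.iter().sum::<f64>() / values.len() as f64)
}

/// (before, after) committed memory in MiB, copied from the table.
const COMMITTED_MIB: [(f64, f64); 4] = [
    (261.81, 216.67), // clap:check
    (195.31, 142.50), // hyper:check
    (223.32, 169.17), // regex:check
    (255.29, 209.21), // syn:check
];

/// Per-benchmark percentage change of committed memory.
fn committed_changes() -> Vec<f64> {
    COMMITTED_MIB
        .iter()
        .map(|&(before, after)| percent_change(before, after))
        .collect()
}
```

Rounded to two decimals, `committed_changes()` gives -17.24 for clap:check, and `mean(&committed_changes())` lands on the reported -21.64 to within rounding. The Total row is a different quantity: it compares the summed Before and After columns, which is why it reads -21.18%.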

Zoxc avatar Mar 21 '25 04:03 Zoxc

Ha, that is good to see! In some of our services v3 reduces memory usage by a lot, but on many small benchmarks the difference is usually less pronounced. On my benchmarks v3 is about as fast as v2 -- maybe we can tune it better to eke out that last 1.23%; can you try with MIMALLOC_PURGE_DELAY=-1? (Is there a way for me to run the benchmarks like you did above and get such a nice report, if it is not too complex to set up?) Also, is this on Linux/x64? Finally, v2.2.2 should really be as fast as 2.1.7 -- I may have changed the abandoned list parameters, and it would be good to fix this regardless of the new v3 version. Thanks again!
(PS: if you are up for it, maybe send me an email sometime and we could chat about your Rust port?)
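
For anyone reproducing the MIMALLOC_PURGE_DELAY=-1 experiment from a Rust harness rather than a shell, the variable can be set on the child process only. This is a minimal sketch: `with_purge_delay` is a hypothetical helper, and the reading of the option (a delay in milliseconds before unused memory is purged, with a negative value turning purging off) is an assumption to verify against the options documentation of the mimalloc version being built.

```rust
use std::process::Command;

/// Build (but do not run) a command that has MIMALLOC_PURGE_DELAY set
/// in the child's environment. The parent's environment is left untouched.
fn with_purge_delay(program: &str, delay_ms: i64) -> Command {
    let mut cmd = Command::new(program);
    cmd.env("MIMALLOC_PURGE_DELAY", delay_ms.to_string());
    cmd
}
```

A call such as `with_purge_delay("rustc", -1).arg("--version").status()` would then spawn the program with the setting applied; the program name is a placeholder for whichever mimalloc-enabled binary is under test.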

daanx avatar Mar 21 '25 23:03 daanx

If you want to do a local build you'd need my mimalloc branch of rustc. You need to enable mimalloc with a bootstrap.toml file:

[rust]
codegen-units = 1
mimalloc = true
deny-warnings = false

The last commit in the branch points to a local checkout of https://github.com/purpleprotocol/mimalloc_rust. That contains a mimalloc submodule which will be used. You can then build the compiler with python x.py build library.

To benchmark the compiler I'm using https://github.com/Zoxc/rcb, see the readme on how to set it up. The run above is ./rcb bench --incr-none -n 40 master~win-mimalloc~9 master~win-mimalloc~8 --details none --check. It was done on Windows 10 x64.

Note that mimalloc is only used for Rust allocations, not for C allocations. In non-check builds LLVM does a few of those.

Zoxc avatar Mar 22 '25 00:03 Zoxc

v3 with (After) and without (Before) MIMALLOC_PURGE_DELAY=-1:

| Benchmark | Time (Before) | Time (After) | Time % | Physical Memory (Before) | Physical Memory (After) | Physical Memory % | Committed Memory (Before) | Committed Memory (After) | Committed Memory % |
|---|---|---|---|---|---|---|---|---|---|
| 🟣 clap:check | 1.2034s | 1.2103s | 0.58% | 147.77 MiB | 151.25 MiB | 💔 2.36% | 216.66 MiB | 220.73 MiB | 💔 1.88% |
| 🟣 hyper:check | 0.1999s | 0.2004s | 0.29% | 78.94 MiB | 78.94 MiB | 0.00% | 142.50 MiB | 142.50 MiB | -0.00% |
| 🟣 regex:check | 0.6826s | 0.6806s | -0.30% | 108.03 MiB | 108.03 MiB | 0.00% | 169.17 MiB | 169.17 MiB | -0.00% |
| 🟣 syn:check | 1.1267s | 1.1251s | -0.14% | 142.63 MiB | 144.00 MiB | 0.96% | 208.61 MiB | 210.71 MiB | 💔 1.00% |
| Total | 3.2126s | 3.2164s | 0.12% | 477.37 MiB | 482.23 MiB | 💔 1.02% | 736.95 MiB | 743.11 MiB | 0.84% |
| Summary | 1.0000s | 1.0010s | 0.10% | 1 byte | 1.01 bytes | 0.83% | 1 byte | 1.01 bytes | 0.72% |

It doesn't seem to have much effect.

Zoxc avatar Mar 22 '25 00:03 Zoxc

Not sure if it's related, but I have also noticed a large memory usage regression between 2.1.7 and 2.2.2 (using the Rust wrapper) on Linux.

2.1.7:

11.30user 0.51system 0:13.55elapsed 87%CPU (0avgtext+0avgdata 2709704maxresident)k
0inputs+0outputs (0major+28945minor)pagefaults 0swaps

2.2.2:

11.70user 1.70system 0:13.61elapsed 98%CPU (0avgtext+0avgdata 4910152maxresident)k
0inputs+0outputs (0major+148644minor)pagefaults 0swaps

There are also many more page faults.
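
To put numbers on the two runs, the fields can be pulled out of the output and compared. The sketch below assumes the lines are in GNU time's default output format, where `maxresident` is reported in KiB; `number_before` and `ratio` are hypothetical helper names.

```rust
/// Return the unsigned integer that immediately precedes `suffix` in `line`,
/// e.g. the `2709704` in `... 2709704maxresident)k`.
fn number_before(line: &str, suffix: &str) -> Option<u64> {
    let end = line.find(suffix)?;
    let head = &line[..end];
    // Count trailing ASCII digits; each is one byte, so the byte offset is safe.
    let digits = head
        .chars()
        .rev()
        .take_while(|c| c.is_ascii_digit())
        .count();
    head[head.len() - digits..].parse().ok()
}

/// How many times larger `after` is than `before`.
fn ratio(before: u64, after: u64) -> f64 {
    after as f64 / before as f64
}
```

Applied to the output above, peak resident memory goes from 2709704 to 4910152 (about 1.8x) and minor page faults from 28945 to 148644 (about 5.1x).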

Arvamer avatar Apr 09 '25 12:04 Arvamer