Weird memory usage of random-find benchmark
The maxrss results of the random-find benchmark have some anomalous spikes for unknown reasons:

From IRC:
```
<squeek502> andrewrk, is it expected that the deflate improvements would lead to a speedup in a lot of the benchmarks (zig-fmt, ast-check-os, etc)?
<andrewrk> mmm I don't think so. there are no dependencies on std lib deflate in the compiler
<squeek502> strange, there's a big spike down at exactly that PR being merged for a lot of benchmarks
<andrewrk> is it around jan 25? that also happens to be the end of the backfill script, so perhaps there was some kind of externality accidentally leaking into the measurements?
<squeek502> ah yeah it is, that makes more sense
<andrewrk> it's most noticeable on Max RSS of random-find
<andrewrk> I have no idea what is happening in this graph
<andrewrk> but the big dip is the end of the backfill script
<andrewrk> ummmm maybe it's the hash table of the records.csv file in the backfill script itself, and we're unintentionally including the parent process memory in the measurement
<andrewrk> so as it goes backwards over time, more measurements collected, more hash table memory
<squeek502> ah that would be interesting, wouldn't explain the CPU instruction count/wall clock/etc dropping though
<andrewrk> agreed, but at least it shows you where the script stopped running
<squeek502> zig-fmt benchmark is the most noticeable for the instruction count/wall clock
<andrewrk> hmm maybe are we accidentally using the (more recent) zig exe that is running the backfill script, instead of using the freshly built zig exe from the relevant commit
```
I tried testing the theory that the memory usage of the backfill script itself is being included, and I ultimately don't think that's what's happening.
To test this, I added:
```zig
// allocate 100MiB to attempt to skew the results to check if the backfill process is included
const dummy = try arena.alloc(u8, 100 * 1024 * 1024);
_ = dummy;
```
to backfill.zig and then ran the benchmarks for the same commit (e498fb155051f548071da1a13098b8793f527275), once with collect-measurements.zig and once with backfill.zig. Here's the maxrss of random-find (in kilobytes, as reported by getrusage) for each:
| collect-measurements | backfill |
|---|---|
| 2408 | 2404 |
Interestingly, values around 2400 only ever showed up during the original gotta-go-fast runs, and somehow all results since then (most of them produced by the backfill script, AFAIK) seem to be getting larger and larger, for no discernible reason, since neither the benchmark nor the hash map implementation has changed much in that time. Also, if you look at the graph, there are actually quite a few of these mini-spikes, which presumably also line up with where the backfill script was run previously.
So, still no clue what's going on here, but hopefully I've narrowed it down slightly. Next I'll probably try running the backfill script locally with just random-find, to see if I can reproduce the increasing maxrss results.
Can't seem to reproduce it when just running random-find:
| maxrss | timestamp | benchmark_name | commit_hash | commit_timestamp | zig_version |
|---|---|---|---|---|---|
| 2408 | 1654218527 | random-find | 288e89b606b46328a5ab358b2eef2c5dc277bc8f | 1654127128 | 0.10.0-dev.2472+288e89b60 |
| 2408 | 1654218803 | random-find | a4cdb49a58f62776b73ad873b25243e65ac29266 | 1654124353 | 0.10.0-dev.2471+a4cdb49a5 |
| 2404 | 1654219072 | random-find | b82cccc9e9b2230097f81fecec12ac0fdae97518 | 1654123401 | 0.10.0-dev.2464+b82cccc9e |
| 2408 | 1654219344 | random-find | b095aa6986badc3b8c2255ad2c824ca4ea9959d9 | 1654117558 | 0.10.0-dev.2463+b095aa698 |
| 2408 | 1654219617 | random-find | de14fba2478019ba6070407502b422d349ebf754 | 1654109580 | 0.10.0-dev.2462+de14fba24 |
| 2408 | 1654219893 | random-find | 69323fc1432146bf166175060dea9d0248dbeba4 | 1654107559 | 0.10.0-dev.2461+69323fc14 |
| 2404 | 1654220171 | random-find | cbb806da6e0b8f2f4d0d12ca57618c291acfa0c0 | 1654062999 | 0.10.0-dev.2460+cbb806da6 |
maxrss is consistent across commits when I run the backfill script locally. I'll try running the full set of benchmarks next to see if memory usage is somehow leaking across different benchmarks (seems impossible, but might as well rule it out).
EDIT: It also doesn't occur when running the full set of benchmarks; I still get ~2400 consistently.
> To test this, I added:

Note that you need to poke every page (i.e. memset all the bytes), otherwise it won't count towards the RSS.
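To illustrate that point, here's a minimal standalone sketch (not from this repo; it assumes a recent Zig where getrusage is exposed as std.posix.getrusage, whereas at the time of this issue it lived under std.os): an allocation that's never written barely moves maxrss, while writing every byte faults all of its pages in.

```zig
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    const before = std.posix.getrusage(std.posix.rusage.SELF);
    std.debug.print("maxrss before alloc:  {d}\n", .{before.maxrss});

    // Allocate 100MiB but don't touch it yet: the kernel only sets up
    // virtual address space, so none of these pages are resident.
    const buf = try allocator.alloc(u8, 100 * 1024 * 1024);
    defer allocator.free(buf);

    const untouched = std.posix.getrusage(std.posix.rusage.SELF);
    std.debug.print("maxrss after alloc:   {d}\n", .{untouched.maxrss});

    // Write to every page so it actually gets faulted in and counted.
    @memset(buf, 1);

    const touched = std.posix.getrusage(std.posix.rusage.SELF);
    std.debug.print("maxrss after memset:  {d}\n", .{touched.maxrss});
}
```

Only the last number should jump, by roughly 100MiB worth of kilobytes.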
Changed the added code to:
```zig
// allocate 100MiB to attempt to skew the results to check if the backfill process is included
var dummy = try arena.alloc(u8, 100 * 1024 * 1024);
std.mem.set(u8, dummy, 1);
std.debug.print("last byte is {}\n", .{dummy[dummy.len - 1]});
```
and verified that `last byte is 1` was printed.
Still got the same result: 2404 maxrss for random-find when using the backfill script.
Ok, finally got something. I downloaded the records.csv from ziglang.org/perf, deleted the entries for e498fb155051f548071da1a13098b8793f527275 and then ran:
```sh
$ZIG run collect-measurements.zig -- records-upstream.csv $ZIG $COMMIT_SHA1 $COMMIT_TIMESTAMP
```
for e498fb155051f548071da1a13098b8793f527275. random-find had a reported maxrss of 11272.
So, it's the memory usage of collect-measurements.zig itself that's being included, and it increases with the size of the csv file (since it does a readFileAlloc on the csv file). random-find is the only benchmark that's noticeably affected, since its real maxrss is low enough to be dwarfed by the readFileAlloc'd csv contents.
EDIT: Double-checked this by adding the dummy allocation code from my previous comment into collect-measurements.zig. The maxrss of random-find jumped to 103000, so this is definitely what's happening.
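For context on why the number scales with the csv: reading the whole file into the harness's own heap up front means its RSS grows with records.csv before any benchmark child is ever spawned. A rough sketch of that kind of call (not the actual collect-measurements.zig code; the file name and size limit here are made up, and it assumes the Dir.readFileAlloc(allocator, path, max_bytes) signature of a recent Zig):

```zig
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Read the entire csv into this process's heap; the bigger the file,
    // the higher this process's RSS before any benchmark runs.
    const csv_bytes = try std.fs.cwd().readFileAlloc(allocator, "records.csv", 1024 * 1024 * 1024);
    defer allocator.free(csv_bytes);

    const usage = std.posix.getrusage(std.posix.rusage.SELF);
    std.debug.print("csv size: {d} bytes, maxrss: {d}\n", .{ csv_bytes.len, usage.maxrss });
}
```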
This seems to be the relevant note from getrusage(2):
> Resource usage metrics are preserved across an execve(2).
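That note seems to be the missing piece: a process's maxrss accounting isn't reset when it execs a new image, so whatever is resident right before the exec stays in the new image's numbers. A minimal standalone sketch of the effect (a hypothetical demo program, not part of the repo, assuming a recent Zig where std.process.execv replaces the current process image):

```zig
// Hypothetical demo: run it with no arguments; it re-execs itself.
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    if (args.len > 1) {
        // Second stage: the freshly exec'd image has allocated nothing,
        // yet maxrss still reflects the 100MiB touched before the exec,
        // because resource usage metrics are preserved across execve(2).
        const usage = std.posix.getrusage(std.posix.rusage.SELF);
        std.debug.print("post-exec maxrss: {d}\n", .{usage.maxrss});
        return;
    }

    // First stage: inflate this process's RSS, then replace the image.
    const buf = try allocator.alloc(u8, 100 * 1024 * 1024);
    @memset(buf, 1); // fault every page in so it counts towards RSS

    const argv = [_][]const u8{ args[0], "second-stage" };
    return std.process.execv(allocator, &argv);
}
```

The second stage reports a maxrss of roughly 100MiB despite never allocating anything itself, which would explain how collect-measurements.zig's own memory ends up in random-find's recorded maxrss once records.csv gets large enough.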
I've been trying to measure the max_rss of some very small, hand-crafted assembly programs, and ran into the same issue with getrusage(2)!
Although what I've been doing is unrelated here, this might be a good read for you: https://jkz.wtf/random-linux-oddity-1-ru_maxrss, since it mentions the parts of the Linux source where these metrics are maintained.