std.mem.zeroes high memory usage & compile time

Open DavDag opened this issue 1 year ago • 5 comments

Zig Version

0.11.0-dev.2546+cb54e9a3c

Steps to Reproduce and Observed Behavior

The compile time of the following code is enormous (~4 min), and so is the RAM usage (~6 GB).

const std = @import("std");

pub fn main() !void {
    const w: u32 = 1080;
    const h: u32 = 720;
    _ = std.mem.zeroes([w * h * 3]u8);
}

Simply substituting it with this alternative

    _ = std.mem.zeroes([w * h * 3]u8); // old
    _ = [1]u8{0} ** (w * h * 3); // new

reduced the compile time to ~1 s and the RAM usage to ~45 MB.
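For reference, the workaround as a complete program might look like the sketch below. The names `size` and `pixels` and the final print are illustrative additions (the original snippet discards the value), and the timings reported in this issue were not re-measured for this exact file.

```zig
const std = @import("std");

pub fn main() !void {
    const w: u32 = 1080;
    const h: u32 = 720;
    // Hypothetical helper name: total number of bytes in an RGB buffer.
    const size = w * h * 3;

    // Repeat a one-element array `size` times instead of calling std.mem.zeroes.
    const pixels = [1]u8{0} ** size;

    // Use the value so the example is observable.
    std.debug.print("{d} bytes\n", .{pixels.len});
}
```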


Using std.mem.zeroes:

Measure-Command {zig build -Doptimize=Debug run}


Days              : 0
Hours             : 0
Minutes           : 3
Seconds           : 43
Milliseconds      : 378
Ticks             : 2233783981
TotalDays         : 0,00258539812615741
TotalHours        : 0,0620495550277778
TotalMinutes      : 3,72297330166667
TotalSeconds      : 223,3783981
TotalMilliseconds : 223378,3981


Using [1]u8{0} ** (size):

Measure-Command {zig build -Doptimize=Debug run}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 51
Ticks             : 10510305
TotalDays         : 1,21647048611111E-05
TotalHours        : 0,000291952916666667
TotalMinutes      : 0,017517175
TotalSeconds      : 1,0510305
TotalMilliseconds : 1051,0305

My hardware (and OS):

  • Windows 11 Pro (version 10.0.22621 Build 22621)
  • Intel Core i9-13900K
  • 128 GB DDR5

Expected Behavior

Much shorter compile time and lower memory usage.

DavDag, Apr 13 '23 11:04

Looks like LLVM doesn't like us returning multi-megabyte arrays (this one is about 2.3 MB). Prefixing the zeroes call with comptime makes the issue go away.

Vexu, Apr 13 '23 13:04
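A minimal sketch of the comptime workaround described in the comment above. The name `pixels` and the print statement are illustrative, and whether this avoids the slowdown on every compiler version has not been verified here.

```zig
const std = @import("std");

pub fn main() !void {
    const w: u32 = 1080;
    const h: u32 = 720;

    // `comptime` forces the call to be evaluated by the compiler, so no
    // runtime function returning a huge array by value is generated.
    const pixels = comptime std.mem.zeroes([w * h * 3]u8);

    // Use the value so the example is observable.
    std.debug.print("{d} bytes\n", .{pixels.len});
}
```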

Got similar results (compile-time-wise & memory-wise) with this piece of code:

const std = @import("std");

test {
    const w: u32 = 1080;
    const h: u32 = 720;

    var pixels: [w * h * 3]f32 = .{0.0} ** (w * h * 3);
    _ = pixels;
}

I guess there's something LLVM does not like about allocating roughly 9 MB (about 2.3 million f32 values) on the stack.


Simply replacing .{0.0} with [1]f32{0.0} "fixes" it again.

const std = @import("std");

test {
    const w: u32 = 1080;
    const h: u32 = 720;

    var pixels: [w * h * 3]f32 = [1]f32{0.0} ** (w * h * 3);
    _ = pixels;
}

DavDag, Apr 14 '23 08:04

Got similar results (compile-time-wise & memory-wise) with this piece of code:

That's because you're making a struct with over 2 million fields. Arrays are able to use an optimization to represent repeated values but if you assign to the middle you should see the same performance.

Vexu, Apr 14 '23 08:04
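The difference between the two spellings can be sketched at a small size. This is an illustration under the assumption that, without a type annotation, `.{0.0} ** n` produces a tuple (a struct with one field per element) while `[1]f32{0.0} ** n` produces an array; the variable names are made up for the example, and newer compiler versions may handle the annotated case differently.

```zig
const std = @import("std");

pub fn main() void {
    // Typed array literal repeated: the result is already the array type [4]f32.
    const as_array = [1]f32{0.0} ** 4;

    // Anonymous literal repeated: the result is a tuple with 4 fields,
    // which only becomes an array when it is coerced to one.
    const as_tuple = .{0.0} ** 4;
    const coerced: [4]f32 = as_tuple;

    // Compile-time checks of the assumption stated above.
    comptime std.debug.assert(@TypeOf(as_array) == [4]f32);
    comptime std.debug.assert(@TypeOf(as_tuple) != [4]f32);

    std.debug.print("{s} / {d} elements\n", .{ @typeName(@TypeOf(as_array)), coerced.len });
}
```

With 4 elements the tuple is harmless; at `w * h * 3` elements it has over two million fields, which matches the explanation given in the comment above.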

That's because you're making a struct with over 2 million fields. Arrays are able to use an optimization to represent repeated values but if you assign to the middle you should see the same performance.

Sorry, I don't understand what you mean by "assign to the middle".


Edit: I understand now. Simply adding an assignment instruction fixed this instance:

const std = @import("std");

test {
    const w: u32 = 1080;
    const h: u32 = 720;

    var pixels: [w * h * 3]f32 = .{0.0} ** (w * h * 3);
    pixels[69] = 42;
}

(Thanks again @travisstaloch)

DavDag, Apr 14 '23 08:04

To help a little bit with investigating:

const std = @import("std");

const Encho = enum { A, B };

pub fn main() !void {
    const T = std.mem.zeroes([123456]Encho); // Does not finish
    // const T: [123456]Encho = undefined; // Fine, ~1s
    // const T =  [1]Encho{.A} ** 123456; // Fine, ~1s
    // const T = std.mem.zeroes([123456]u8); // Fine, ~2s
    std.log.info("{}", .{T.len});
}

Notice there are 2 ways to fix the issue: either replace std.mem.zeroes or replace the enum.

zig version: 0.12.0
os: Darwin root:xnu-10063.101.17~1/RELEASE_ARM64_T6000 arm64

StanislavNikolov, Apr 24 '24 00:04