zig Improve code generation of zig.fmt

trafficstars

Today the code generated by zig.fmt is far from desirable.

Here is a sketch that I made up in a afternoon. Of course it's not a complete implementation, so it can't print everything yet, and most importantly it doesn't handle a runtime known writer properly yet, but it shows what the main issues are.

In my opinion, doing log("Hello, {d}!", .{asm("rdrand %[reg]" : [reg] "=r" (->u64))}); should be entirely inlined, and instead just call a couple of functions, and be equivalent to the following code (and this is what I achieved)

    call get_some_log_lock
    push rax
    lea rdi, "Hello, "
    call print_string
    rdrand rdi
    call print_runtime_decimal_value
    lea rdi, "!\n"
    call print_string
    pop rdi
    call release_log_lock

Today, you get the worst of both worlds. You get one function for each call site, which parses the string at runtime, causing a lot of code bloat. Let's get some numbers on this. Replacing the std.fmt usage in the Florence OS kernel resulted in the following change:

$ bloaty --debug-vmaddr=0xF zig-out/bin/Flork_stivale2_x86_64_newlog -- zig-out/bin/Flork_stivale2_x86_64
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.1%     +14  [ = ]       0    .debug_abbrev
   +22%      +4  [ = ]       0    [Unmapped]
  [ = ]       0  [DEL]      -8    [LOAD #2 [RW]]
  [ = ]       0  -0.0%     -24    .bss
  -2.3% -1.64Ki  [ = ]       0    .debug_pubnames
 -10.0% -3.03Ki  [ = ]       0    .debug_frame
 -12.6% -5.08Ki  [ = ]       0    .debug_pubtypes
  -5.1% -6.32Ki  [ = ]       0    .debug_str
 -68.4% -7.01Ki -68.8% -7.01Ki    .data
 -10.8% -7.68Ki -10.8% -7.68Ki    .rodata
 -24.3% -11.6Ki  [ = ]       0    .strtab
 -43.1% -17.1Ki  [ = ]       0    .symtab
 -17.4% -35.3Ki  [ = ]       0    .debug_line
 -25.9% -38.7Ki  [ = ]       0    .debug_ranges
 -21.7% -72.8Ki -21.7% -72.8Ki    .text
 -24.3%  -133Ki  [ = ]       0    .debug_info
 -23.7%  -136Ki  [ = ]       0    .debug_loc
 -21.1%  -476Ki  -0.5% -87.5Ki    TOTAL

As we can see, over 20% of the entire kernel executable was just unnecessary code generated by std.fmt, and the same goes for the executable in large, half a meg just disappeared.

I think the main issues with std.fmt today is that not enough functions are callconv(.Inline) nor noinline. I achieved this by going the extra mile to say that everything that takes comptime fmt: []const u8 is callconv(.Inline), and anything that doesn't is noinline. Just doing this gets you very far. That way the format string code generation is as specialized to the format string as possible, and it instead tries to break it down into elementary components.

I also took extra caution to always append comptime known values to the format string itself and, by doing that, combine as many value writes as possible.

Aug 27 '21 09:08 N00byEdge

For LLVM this should work fine. Probably this should be done after the planned zig.fmt overhaul in #1358 and sufficient testing.

Would this break DWARF info in self-hosting (without LLVM) for the inlined functions?

Aug 27 '21 12:08 matu3ba

When you use RdRand or RdSeed you should check the cary flag status which tells you whether or not you got a random number. So just inlining the RdRand instruction and not checking the flag is not a good plan. It'll work providing the silicon is healthy or not under physical attack, but in a security context, it's important to check the carry flag.

Feb 04 '22 05:02 dj-on-github

@dj-on-github I think your post belongs into #10755 or #10758 as this issue does not deal with random numbers.

Feb 04 '22 08:02 matu3ba

@dj-on-github providing a securely random number was not my intention, just to show what I think the code generation should be for this part of the standard library.

Feb 04 '22 12:02 N00byEdge

not sure when exactly it regressed but I looked into this recently and the initial suggestion of marking std.fmt.format as inline and seeing how that affects codegen and compilation time is currently blocked on https://github.com/ziglang/zig/issues/15668

Feb 23 '24 03:02 nektro

#15668 is fixed; anything in particular blocking this now other than just available dev time?

Apr 21 '24 06:04 Khitiara

zig zig copied to clipboard

Improve code generation of zig.fmt

zig
zig copied to clipboard