how to get a callstack from a validation error?

Open nurpax opened this issue 4 months ago • 7 comments

When I use sokol-zig in my application and my code triggers a validation error, the app exits with something like this:

-------------
[sg][error][id:335] /Users/janne/.cache/zig/p/sokol-0.1.0-pb1HK_ZPLgC-2uMA63u1iiB2BqPcibWOEZVH0Hz1ursz/src/sokol/c/sokol_gfx.h:19426:0:
        VALIDATE_APIP_DEPTHSTENCILATTACHMENT_FORMAT: sg_apply_pipeline: pipeline .depth.pixel_format doesn't match sg_pass.attachments.depth_stencil image pixel format

[sg][panic][id:406] /Users/janne/.cache/zig/p/sokol-0.1.0-pb1HK_ZPLgC-2uMA63u1iiB2BqPcibWOEZVH0Hz1ursz/src/sokol/c/sokol_gfx.h:18387:0:
        VALIDATION_FAILED: validation layer checks failed

ABORTING because of [panic]

Is there a way to configure my build so that I'd get a full zig stacktrace of the call leading to the error?

nurpax, Sep 18 '25 16:09

Hmm, not out of the box, but maybe by passing a different logger function into the sokol libraries. What happens is that _sg_log() is called, which in turn calls the user-provided logging function:

https://github.com/floooh/sokol/blob/74bd1cc77022586de08e72b597dfccff4a6465f4/sokol_gfx.h#L6767-L6783

...the vanilla logger function in sokol_log.h calls the C function abort() on a panic-level log message:

https://github.com/floooh/sokol/blob/74bd1cc77022586de08e72b597dfccff4a6465f4/sokol_log.h#L330-L332

You can replace this log function in the sg.setup() call, e.g.:

https://github.com/floooh/sokol-zig/blob/f54f1b00aa2b2421f06422bd35a0930ff0b17197/examples/clear.zig#L20

...and an alternative zig function would need this function signature, and probably call @panic() instead of abort() (and @panic hopefully dumps a call stack):

https://github.com/floooh/sokol-zig/blob/f54f1b00aa2b2421f06422bd35a0930ff0b17197/src/sokol/log.zig#L116

...we could probably include such an alternative log function in a manually written zig file somewhere in sokol-zig.

PR welcome ;)

floooh, Sep 18 '25 19:09

PS: the log function in sokol_log.h looks so weird because I want to avoid pulling in a large printf() function. In Zig this probably isn't an issue and we could use a regular formatted std.debug.print().

floooh, Sep 18 '25 19:09

Ok, very quick proof-of-concept in the clear.zig example:

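// Replacement for the default sokol_log.h logger: prints the message and
// panics from Zig (instead of calling abort()) on a panic-level message.
export fn log(tag: [*c]const u8, log_level: u32, log_item: u32, msg: [*c]const u8, line_nr: u32, filename: [*c]const u8, ud: ?*anyopaque) void {
    // unused in this proof-of-concept
    _ = ud;
    _ = tag;
    _ = log_item;
    _ = line_nr;
    _ = filename;
    const std = @import("std");
    std.debug.print("{s}\n", .{msg});
    // log level 0 is 'panic' (1 = error, 2 = warning, 3 = info)
    if (log_level == 0) {
        @panic("PANIC");
    }
}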

export fn init() void {
    sg.setup(.{
        .environment = sglue.environment(),
        // install the custom logger defined above
        .logger = .{ .func = log },
    });
    // ...rest of init() unchanged
}

...and then replacing the beginPass call with this (note the missing swapchain) to trigger a validation error:

    sg.beginPass(.{ .action = state.pass_action });

...gives the following output on macOS (i.e. it looks a bit messy, but the callstack is there):

sokol-zig ➤ zig build run-clear                                                                                                            git:master*
Backend: .METAL_MACOS
VALIDATE_BEGINPASS_SWAPCHAIN_EXPECT_WIDTH: sg_begin_pass: expected pass.swapchain.width > 0
VALIDATE_BEGINPASS_SWAPCHAIN_EXPECT_HEIGHT: sg_begin_pass: expected pass.swapchain.height > 0
VALIDATE_BEGINPASS_SWAPCHAIN_METAL_EXPECT_CURRENTDRAWABLE: sg_begin_pass: expected pass.swapchain.metal.current_drawable != 0
VALIDATE_BEGINPASS_SWAPCHAIN_METAL_EXPECT_DEPTHSTENCILTEXTURE: sg_begin_pass: expected pass.swapchain.metal.depth_stencil_texture != 0
VALIDATION_FAILED: validation layer checks failed
thread 968496 panic: PANIC
/Users/floh/projects/sokol-zig/examples/clear.zig:26:9: 0x100f108b3 in log (clear)
        @panic("PANIC");
        ^
???:?:?: 0x100ddac4b in __sg_log (???)
???:?:?: 0x100e31863 in __sg_validate_end (???)
???:?:?: 0x100df9aef in __sg_validate_begin_pass (???)
???:?:?: 0x100df3bfb in _sg_begin_pass (???)
/Users/floh/projects/sokol-zig/src/sokol/gfx.zig:4778:18: 0x100f10a63 in beginPass (clear)
    sg_begin_pass(&pass);
                 ^
/Users/floh/projects/sokol-zig/examples/clear.zig:45:17: 0x100f1058b in frame (clear)
    sg.beginPass(.{ .action = state.pass_action });
                ^
???:?:?: 0x100dcb393 in __sapp_call_frame (???)
???:?:?: 0x100dbc5cb in __sapp_frame (???)
???:?:?: 0x100dbc1bf in -[_sapp_macos_view drawRect:] (???)

[...lots of macOS OS functions here...]

/Users/floh/projects/sokol-zig/src/sokol/app.zig:2150:13: 0x100f102d3 in run (clear)
    sapp_run(&desc);
            ^
/Users/floh/projects/sokol-zig/examples/clear.zig:55:13: 0x100f101f3 in main (clear)
    sapp.run(.{
            ^
/Users/floh/.zvm/master/lib/std/start.zig:618:22: 0x100f10117 in main (clear)
            root.main();

floooh, Sep 18 '25 19:09

...a real logger function should format the message closer to the original, at least include the tag, log level and error id:

[sg][error][id:335] VALIDATE_BEGINPASS_SWAPCHAIN_EXPECT_WIDTH: sg_begin_pass: expected pass.swapchain.width > 0

...the source code location in the sokol header probably isn't all that important when you have a callstack to the function call into the sokol libraries which caused the problem.
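
A minimal sketch of what such a logger could look like in Zig, assuming the callback parameters from the proof of concept above and the level numbering used by sokol_log.h (0 = panic, 1 = error, 2 = warning, anything else = info). The names levelName, formatLogLine and zigLog are made up for illustration and are not part of sokol-zig; callconv(.c) assumes a recent Zig (0.14 or later).

```zig
const std = @import("std");

/// Label for sokol's numeric log level, mirroring sokol_log.h:
/// 0 = panic, 1 = error, 2 = warning, anything else = info.
fn levelName(log_level: u32) []const u8 {
    return switch (log_level) {
        0 => "panic",
        1 => "error",
        2 => "warning",
        else => "info",
    };
}

/// Hypothetical helper: renders "[tag][level][id:N] message" into `buf`.
/// Falls back to a fixed string if `buf` is too small.
fn formatLogLine(buf: []u8, tag: []const u8, log_level: u32, log_item: u32, msg: []const u8) []const u8 {
    return std.fmt.bufPrint(buf, "[{s}][{s}][id:{d}] {s}", .{
        tag, levelName(log_level), log_item, msg,
    }) catch "<log message too long>";
}

/// Hypothetical replacement logger with the same parameter list as the
/// proof of concept. Prints one formatted line, then panics from Zig on a
/// panic-level message so that a stack trace is dumped.
fn zigLog(tag: [*c]const u8, log_level: u32, log_item: u32, msg: [*c]const u8, line_nr: u32, filename: [*c]const u8, user_data: ?*anyopaque) callconv(.c) void {
    _ = line_nr;
    _ = filename;
    _ = user_data;
    var buf: [1024]u8 = undefined;
    // The C strings may be null, so guard before converting them to slices.
    const tag_str: []const u8 = if (tag != null) std.mem.span(tag) else "";
    const msg_str: []const u8 = if (msg != null) std.mem.span(msg) else "";
    std.debug.print("{s}\n", .{formatLogLine(&buf, tag_str, log_level, log_item, msg_str)});
    if (log_level == 0) @panic("sokol panic");
}
```

It would be hooked up the same way as in the proof of concept, i.e. via .logger = .{ .func = zigLog } in the sg.setup() call.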

floooh, Sep 18 '25 19:09

Reminder to self: also test WASM (does the Zig callstack dumper actually work there?). We could always conditionally just call the sokol_log.h C function from within the Zig override when the target is wasm32-emscripten.
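
The conditional fallback could look roughly like this. This is a sketch under assumptions: stockLogger is only a stand-in for the stock sokol_log.h function so that the snippet compiles on its own, and zigLog and panicsInZig are hypothetical names that do not exist in sokol-zig. Whether Zig's stack dump works on wasm32-emscripten is exactly the open question, so the sketch simply never panics from Zig on WASM targets.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// True when compiling for a WebAssembly CPU architecture (wasm32 or wasm64).
const is_wasm = builtin.cpu.arch.isWasm();

/// Stand-in for the stock sokol_log.h logger (does nothing here).
/// Assumption: a real build would forward to the C logger function instead.
fn stockLogger(_: [*c]const u8, _: u32, _: u32, _: [*c]const u8, _: u32, _: [*c]const u8, _: ?*anyopaque) callconv(.c) void {}

/// Hypothetical helper: panic from Zig only for panic-level messages
/// (log level 0) and only on non-WASM targets.
fn panicsInZig(log_level: u32) bool {
    return !is_wasm and log_level == 0;
}

/// Hypothetical logger override that defers to the C logger on WASM.
fn zigLog(tag: [*c]const u8, log_level: u32, log_item: u32, msg: [*c]const u8, line_nr: u32, filename: [*c]const u8, user_data: ?*anyopaque) callconv(.c) void {
    if (is_wasm) {
        // Keep the old behaviour: the C logger prints and aborts on panic.
        stockLogger(tag, log_level, log_item, msg, line_nr, filename, user_data);
        return;
    }
    if (msg != null) std.debug.print("{s}\n", .{std.mem.span(msg)});
    if (panicsInZig(log_level)) @panic("sokol panic");
}
```

Since is_wasm is known at compile time, only one of the two paths ends up in the binary for a given target.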

floooh, Sep 18 '25 19:09

Btw, very related: this stale PR is from a time before the runtime logging hook existed :)

https://github.com/floooh/sokol-zig/pull/33

floooh, Sep 18 '25 19:09

Woah! The proof of concept is already very useful! Thanks!

> the source code location in the sokol header probably isn't all that important when you have a callstack to the function call into the sokol libraries which caused the problem

Yeah, not quite as useful. Still somewhat useful, as I like to open the .h at the reported line number in VSCode to read the validation code for the error. This usually makes it easier to understand what the error was about.
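
If the header location should stay in the output, a logger override could print it in the same path:line:0: form as the default output quoted in the first post. A small sketch, assuming the filename and line_nr callback parameters from the proof of concept; formatLocation is a hypothetical helper and not part of sokol-zig. The filename can be null (sokol only passes it in debug builds), so that case yields an empty string.

```zig
const std = @import("std");

/// Hypothetical helper: renders "path:line:0:" into `buf`, matching the
/// location prefix of the default logger output. Returns an empty string
/// if no filename is available or `buf` is too small.
fn formatLocation(buf: []u8, filename: [*c]const u8, line_nr: u32) []const u8 {
    if (filename == null) return "";
    return std.fmt.bufPrint(buf, "{s}:{d}:0:", .{ std.mem.span(filename), line_nr }) catch "";
}
```

Inside a logger callback this could be printed right before the message, e.g. with std.debug.print("{s}\n", .{formatLocation(&buf, filename, line_nr)}).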

nurpax, Sep 18 '25 20:09