sysgpu: Index OOB in shader compiler

Open msg-programs opened this issue 9 months ago • 3 comments
PS B:\c-workspace\mode8> zig build run-debug
debug(mach): primary monitor work topleft=0,0 size=1280x720
info(mach): found D3D12 backend on Integrated GPU adapter: Intel(R) UHD Graphics 620, 

poweron
thread 2612 panic: index out of bounds: index 2863311530, len 127
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:4428:59: 0x7ff7f8949996 in getInst (debug.exe.obj)
    return astgen.instructions.entries.slice().items(.key)[@intFromEnum(inst)];
                                                          ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:2543:58: 0x7ff7f8a23ea7 in genStructConstruct (debug.exe.obj)
            if (try astgen.coerce(arg_res, astgen.getInst(struct_members[i]).struct_member.type)) {
                                                         ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:1880:63: 0x7ff7f8a525be in genCall (debug.exe.obj)
                .@"struct" => return astgen.genStructConstruct(scope, decl, node),
                                                              ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:1383:32: 0x7ff7f8949776 in genExpr (debug.exe.obj)
        .call => astgen.genCall(scope, node),
                               ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:1141:35: 0x7ff7f8a62913 in genCompoundAssign (debug.exe.obj)
    const rhs = try astgen.genExpr(scope, node_rhs);
                                  ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:885:57: 0x7ff7f8a69ccc in genStatement (debug.exe.obj)
        .compound_assign => try astgen.genCompoundAssign(scope, node),
                                                        ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:872:46: 0x7ff7f89519be in genBlock (debug.exe.obj)
        const stmnt = try astgen.genStatement(scope, stmnt_node);
                                             ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:558:38: 0x7ff7f8950603 in genFn (debug.exe.obj)
    const block = try astgen.genBlock(scope, node_rhs);
                                     ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\AstGen.zig:73:40: 0x7ff7f895255b in genTranslationUnit (debug.exe.obj)
                break :blk astgen.genFn(root_scope, node, false) catch |err| switch (err) {
                                       ^
C:\Users\missing\AppData\Local\zig\p\122092af23a9e9346b15032fc6ff896630a3729209f36efa1e7881363c3342303eca\src\sysgpu\shader\Air.zig:58:56: 0x7ff7f8912831 in generate (debug.exe.obj)
    const globals_index = try astgen.genTranslationUnit();
                                                       ^
B:\c-workspace\mode8\src\magic_smoke\ppu_pipeline.zig:72:60: 0x7ff7f888d80c in setupPipeline__anon_40320 (debug.exe.obj)      
    const ppu_module = window.device.createShaderModuleWGSL("ppu.wgsl", @embedFile("ppu.wgsl"));
                                                           ^

Note: 2863311530 is 0xAAAAAAAA, so it seems like the index is undefined. I'd love to add an MRE, but I have no idea what might cause this and the shader is quite big. I'll attach it below in the hope that it helps.

ppu.wgsl
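
For anyone double-checking the 0xAAAAAAAA observation: Zig writes 0xAA bytes into undefined memory in Debug builds, so a 32-bit index read from such memory shows up as exactly the number in the panic. A minimal sketch of that arithmetic follows (the function name is made up for illustration and is not part of mach):

```zig
const std = @import("std");

// Hypothetical helper: builds a u32 whose four bytes are all 0xAA, the
// pattern Zig writes into `undefined` memory in Debug builds.
fn undefinedPatternU32() u32 {
    const bytes = [4]u8{ 0xAA, 0xAA, 0xAA, 0xAA };
    return @bitCast(bytes);
}

pub fn main() void {
    // Prints the same value as the out-of-bounds index in the panic message.
    std.debug.print("{d}\n", .{undefinedPatternU32()});
}
```

This only confirms that the index came from memory holding the undefined fill pattern. It does not say how the code ended up reading that memory.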

msg-programs avatar Jan 30 '25 20:01 msg-programs

Tried getting a bit more info on what's going wrong. I've included some detail on how I went about this so that people can double-check.

// Tokenizer.zig, Line 414
if (result.loc.start > 8000 and result.loc.start < 9500) {
    const extra = result.loc.extraInfo(tokenizer.source);
    std.debug.print("Token \"{s: >25}\" at L{d: >4}:{d: >3} (internally {d: >5}..{d: >5}) type {}\n", .{ result.loc.slice(tokenizer.source), extra.line, extra.col, result.loc.start, result.loc.end, result.tag });
}

// AstGen.zig, Line 1873
std.debug.print("{} \n\t{} \n\t{} \n\t{} \n\t{} \n\t{}\n\n", .{ token, token_tag, token_loc, node_lhs, node_rhs, node_loc });

// AstGen.zig, Line 2545
std.debug.print("{} \n\t{} \n\t{} \n\t{} \n\t{}\n\n", .{ arg, arg_res, struct_members[i], i, node_loc });

The last two print statements print the following before the crash happens:

// <snip>
sysgpu.shader.Ast.TokenIndex(1682)
        sysgpu.shader.Token.Tag.k_array
        sysgpu.shader.Token.Loc{ .start = 9257, .end = 9262 }
        sysgpu.shader.Ast.NodeIndex(816)
        sysgpu.shader.Ast.NodeIndex(780)
        sysgpu.shader.Token.Loc{ .start = 9257, .end = 9262 }

sysgpu.shader.Air.InstIndex(126)
        sysgpu.shader.Air.InstIndex(126)
        sysgpu.shader.Air.InstIndex(2863311530)
        5
        sysgpu.shader.Token.Loc{ .start = 9055, .end = 9067 }

This seems to reference these tokens, as printed by the first statement:

// <snip>
Token "             CompSettings" at L 281: 21 (internally  9055.. 9067) type sysgpu.shader.Token.Tag.ident
// <snip>
Token "                    array" at L 287:  9 (internally  9257.. 9262) type sysgpu.shader.Token.Tag.k_array
// <snip>

Not very useful for finding the cause, but knowing where it happens is a start for now.

msg-programs avatar Jan 31 '25 20:01 msg-programs

Didn't get any further with debugging, but based on the insights I managed to boil the shader down to this snippet that seems to crash in the same way:

struct Foo {
    a: array<u32, 2>,
    b: array<u32, 2>
};

fn bar() {
    let foo = Foo(array(1, 2), array(3, 4));
}

msg-programs avatar Feb 03 '25 18:02 msg-programs

Found the issue. It's in here: https://github.com/hexops/mach/blob/b14f8e69ee8eb834695eb0d0582053e555d10156/src/sysgpu/shader/AstGen.zig#L2526-L2544

L2533 stores a slice of astgen.refs.items named struct_members and uses it later in L2543. The problem is that the astgen.genExpr() call in L2541 can grow astgen.refs, which may reallocate its backing memory and invalidate that slice. In the MRE this happens via genExpr --> genCall --> node_rhs != .none --> token_tag == .k_array --> node_lhs != .none --> astgen.addRefList() --> astgen.refs.ensureUnusedCapacity().

I'll submit a PR with a fix later.
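
To illustrate the pattern described above, here is a minimal standalone sketch, not the actual AstGen.zig code or the eventual PR. It assumes a Zig version where std.ArrayListUnmanaged is available. RefList, addRefList() and sumWhileGrowing() are hypothetical stand-ins for illustration only. The idea is to keep an index range across calls that may grow the list and re-derive the slice from refs.items each time, instead of holding on to a slice taken before the call:

```zig
const std = @import("std");

// Hypothetical stand-in for the type of astgen.refs.
const RefList = std.ArrayListUnmanaged(u32);

// Hypothetical stand-in for astgen.addRefList(): appends `new` and returns
// the index of the first appended element. The append may reallocate
// `refs.items`, which invalidates any slice taken from it earlier.
fn addRefList(gpa: std.mem.Allocator, refs: *RefList, new: []const u32) !usize {
    const start = refs.items.len;
    try refs.appendSlice(gpa, new);
    return start;
}

// Sums `len` entries starting at `start` while the list keeps growing.
// Only the index range is kept across the growing call; the slice is
// re-derived in every iteration, so a reallocation cannot leave it dangling.
fn sumWhileGrowing(gpa: std.mem.Allocator, refs: *RefList, start: usize, len: usize) !u32 {
    var total: u32 = 0;
    var i: usize = 0;
    while (i < len) : (i += 1) {
        // Plays the role of genExpr(): may move refs.items.
        _ = try addRefList(gpa, refs, &[_]u32{ 0, 0, 0, 0 });
        // Re-sliced after the call that may have grown the list.
        const members = refs.items[start..][0..len];
        total += members[i];
    }
    return total;
}

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    var refs: RefList = .{};
    defer refs.deinit(gpa);

    const start = try addRefList(gpa, &refs, &[_]u32{ 10, 20, 30 });
    const total = try sumWhileGrowing(gpa, &refs, start, 3);
    std.debug.print("{d}\n", .{total});
}
```

Taking `members` once before the loop would correspond to the buggy shape: it may keep working while the list has spare capacity, then read stale memory once a reallocation happens. Copying the needed entries out before the call would be another option; which one fits AstGen.zig best is a judgment call.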

msg-programs avatar Feb 21 '25 17:02 msg-programs