Odin icon indicating copy to clipboard operation
Odin copied to clipboard

mu.init() hangs the compiler.

Open sigmawq opened this issue 3 years ago • 11 comments

Context

    Odin: dev-2022-06-nightly:ba5f7c4e
    OS:   Windows 10 Home Basic (version: 21H2), build 19044.1706
    CPU:  Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
    RAM:  16325 MiB

Expected Behavior

This code should compile with the -debug flag:

package main

import mu "vendor:microui"

main :: proc() {
	mu_ctx: mu.Context
	mu.init(&mu_ctx)
}

Current Behavior

This code hangs the compiler if being compiled with the -debug flag. Tried several LLVM versions and still no success.

Steps to Reproduce

Try to compile the code above with the -debug flag.

sigmawq avatar Jun 12 '22 18:06 sigmawq

Confirmed. Hangs in LLVM-C.dll. image image

Kelimion avatar Jun 12 '22 18:06 Kelimion

I'm guessing I know exactly what is causing this:

LLVM doesn't like large aggregate types.

gingerBill avatar Jun 21 '22 10:06 gingerBill

At least of commit 1676c643dfd9ef45f2aaa4dfddb69cea4bcf80fc, on LLVM 14.0.6, this causes the following LLVM asserts to fail:

    assert(NumValues == VTs.NumVTs && "NumValues wasn't wide enough for its operands!");

Where NumValues == 3153 and VTs.NumVTs == 265297.

  assert(SDNode::getMaxNumOperands() >= Vals.size() && "too many operands to fit into SDNode");

Vals.length is 265297, and I presume size() is a getter method of that field. RemedyBG doesn't reveal what SDNode::getMaxNumOperands() is though.

All this is followed by a chain of failed asserts that depend on these conditions not being met.

A more minimal repo is required to make forward progress on this issue. There's no way I know of to force LLVM to validate each new line of IR as its being written. Currently, the whole is lowered at once in a call to LLVMTargetMachineEmitToFile. Even though I have a complete stack frame picture of LLVM internals, I have absolutely no idea how this maps to Odin source.

JesseRMeyer avatar Jun 30 '22 18:06 JesseRMeyer

I think I found a minimal repro. It looks like compiler hangs when you try zero a large struct:

package main

VeryLargeStruct :: struct {
    data: [1024 * 100]u8,
}

main :: proc() {
    d: VeryLargeStruct = {}
}

When you delete = {} it compiles normally. I looks like it is not a infinity loop but the larger the struct the longer the compilation gets. And since microui.Context is very large (around 260 KB) zeroing it takes a lot of time.

Another thing I spotted, when I create and zero microui.Context in the main(), I get warning: Warning: Declaration of 'ctx' may cause a stack overflow due to its type 'Context' having a size of 271448 bytes, but when that happens in the init() function, the warning isn't shown.

It looks like the workarond for now is not using = {} on large structs, alternativly using mem.zero()

Caedo avatar Jul 15 '22 19:07 Caedo

This is mostly a limitation of LLVM, it has issues zero initializing large structures since it generates code inline to zero them out, rather than calling on a function which explicitly zeros it like mem.zero here.

Generally we should be looking at the size ourselves here and calling on a zeroing function if it's too large rather than relying on LLVM. I wouldn't really call this a bug on their end either as it's a design choice for optimization purposes to want to inline small zeroing. They instead leave it up to language designers to chose what happens with large types instead which we just don't handle yet.

In general though your work around is what the compiler would have to do anyways so continue using it. It's generally not a good idea to use trivial initialization for large types though, even if you know the compiler will do the right thing here and not inline megabytes of mov instructions to zero it out, just because it might and that'll obviously be slower than an optimized memory zeroing function.

graphitemaster avatar Jul 16 '22 02:07 graphitemaster

What's interesting is that if you look at the generated IR, it does it correctly:

; Function Attrs: nounwind
define internal void @main.main(i8* noalias nocapture nonnull %__.context_ptr) #0 {
decls:
  %0 = alloca %main.VeryLargeStruct, align 1
  br label %entry

entry:                                            ; preds = %decls
  %1 = bitcast %main.VeryLargeStruct* %0 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 1 %1, i8 0, i64 102400, i1 false)
  ret void
}

gingerBill avatar Jul 16 '22 15:07 gingerBill

The bug only occurs when using -debug. Therefore it is fundamentally a bug in LLVM that seems like it is not fixable by our side, even if use runtime.memset instead of the LLVM one.

gingerBill avatar Jul 16 '22 15:07 gingerBill

The bug only occurs when using -debug. Therefore it is fundamentally a bug in LLVM that seems like it is not fixable by our side, even if use runtime.memset instead of the LLVM one.

Maybe. This assumes the compiler's generated debug information is being produced correctly, for some definition of correct. I wonder if the compiler is generating 1024 * 100 debug entries for this assignment (but, maybe, should instead compress it down to 1). We've seen LLVM underflow bugs recently, and the fact that 3153 is so much smaller than 265297 makes me wonder if we're seeing something similar can may be worked around in the front-end.

JesseRMeyer avatar Jul 17 '22 13:07 JesseRMeyer

Interestingly the minimal repro in https://github.com/odin-lang/Odin/issues/1843#issuecomment-1185839425 doesn't crash on LLVM 15 with -debug

But this does:

package main

VeryLargeStruct :: struct {
    vals: [65536]u8, // change to 65535 to avoid crash. The datatype doesn't matter.
}

main :: proc() {
    d := VeryLargeStruct {vals={0=1}}
}

It's basically the same problem as described in https://github.com/odin-lang/Odin/issues/1843#issuecomment-1186073867, just under a different disguise.

IanLilleyT avatar Aug 10 '22 00:08 IanLilleyT

My guess is that https://github.com/odin-lang/Odin/commit/e6ab4f48567175cc192a658fb6d7b067e38912d8 needs to be extended to support memset of non-zero values.

JesseRMeyer avatar Aug 10 '22 16:08 JesseRMeyer

Oh wait did that commit fix the zeroing slowness / crash that this issue is about and I confused it with LLVM fixing the problem?

Anyway, take a look at https://gcc.godbolt.org/z/fnx5v6415 for how C handles the partially filled store

IanLilleyT avatar Aug 10 '22 16:08 IanLilleyT