Crash if LHS is invalidated during RHS evaluation

Open vtereshkov opened this issue 5 months ago • 9 comments

In assignments:

var p: ^int

fn foo(): int {
    p = null
    return 666
}

fn main() {
    p = new(int, 42)
    p^ = foo()                // Crash!
    printf("%llv\n", p)
}

vtereshkov avatar Jul 19 '25 23:07 vtereshkov

In expressions:

type P = struct {
    x: int
}

fn foo(p: ^^P): P {
    v := p^^
    p^ = null
    return v
}

fn main() {
    p := &P{5}

    if p^ == foo(&p) {          // Crash!
        printf("Equal\n")
    } else {
        printf("Not equal\n")
    }
}

vtereshkov avatar Jul 20 '25 12:07 vtereshkov

In built-in function calls:

var a: ^[]int

fn foo(): int {
    a = null
    return 0
}

fn main() {
    a = new([]int)
    b := slice(a^, foo())    // Crash!
    printf("%v %v\n", a^, b)
}

vtereshkov avatar Jul 21 '25 22:07 vtereshkov

In indexing:

var p: ^[2]int

fn foo(): int {
    p = null
    return 0
}

fn main() {
    p = new([2]int)
    p[foo()] = 666                // Crash!
    printf("OK\n")
}

vtereshkov avatar Jul 22 '25 15:07 vtereshkov

Does "crash" here mean that the VM itself crashes, rather than the Umka program?

marekmaskarinec avatar Jul 22 '25 16:07 marekmaskarinec

@marekmaskarinec

Short answer: yes.

To be honest, I don't know how these examples should behave. None of them is sane Umka code. They should either silently produce meaningless results (i.e. results that immediately get discarded, as they are no longer attached to identifiers), or trigger null pointer runtime errors, but not crash the interpreter, of course.

These examples violate the fundamental memory safety guarantee:

  • If a (strong) pointer exists anywhere, it must be valid
  • For a pointer to be valid, it must be counted as +1 ref

It follows that any pointer being pushed onto the stack should increment the ref count, even if it's not attached to an identifier. However, if I implemented this principle literally, in a brute-force manner, I would kill performance.

I feel like a hostage to my "correctness first" ideal. Had I discovered these examples five years ago, I might have given up ref counting in favor of a tracing garbage collector that scans the whole stack.

vtereshkov avatar Jul 22 '25 17:07 vtereshkov
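
The "+1 ref for every pointer on the stack" principle described above can be illustrated with a toy model. The following is only a sketch in Go, not Umka's actual runtime: Box, retain, release, assign and demo are hypothetical names, and "freed" is just a flag rather than real deallocation. It simulates p^ = foo() where foo() drops the last named reference, first with an uncounted temporary and then with a counted one.

```go
package main

import "fmt"

// Box is a hypothetical ref-counted heap cell (an assumption for this sketch).
type Box struct {
	refs  int
	value int
	freed bool
}

// retain counts one more strong reference to b.
func retain(b *Box) *Box {
	b.refs++
	return b
}

// release drops one reference and marks the box freed when none remain.
func release(b *Box) {
	b.refs--
	if b.refs == 0 {
		b.freed = true
	}
}

// assign models `p^ = rhs()`: the target is taken first, then rhs runs.
// With pin set, the temporary target is counted as +1 until the store is done.
// It reports whether the store would have touched a freed box.
func assign(global **Box, rhs func() int, pin bool) bool {
	target := *global
	if pin {
		retain(target)
		defer release(target)
	}
	v := rhs()
	if target.freed {
		return true // use-after-free: this is the crash scenario
	}
	target.value = v
	return false
}

// demo models the first example of the issue: foo() sets p to null.
func demo(pin bool) bool {
	p := retain(&Box{value: 42}) // p = new(int, 42)
	foo := func() int {
		release(p) // p = null drops the only named reference
		p = nil
		return 666
	}
	return assign(&p, foo, pin)
}

func main() {
	fmt.Println("uncounted temporary hits freed memory:", demo(false)) // true
	fmt.Println("counted temporary hits freed memory:", demo(true))    // false
}
```

The counted variant pays one increment and one decrement per temporary, which is the brute-force cost the comment above is worried about.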

All these examples produce a nil pointer runtime error in Go.

It's curious that even some more straightforward examples that seem to be correct still fail:

https://go.dev/play/p/xEO_oAc3k4-

According to the Go spec, they may or may not fail because the order of evaluation is fixed only for function calls, <-, || and &&, but not for other operators:

Otherwise, when evaluating the operands of an expression, assignment, or return statement, all function calls, method calls, receive operations, and binary logical operations are evaluated in lexical left-to-right order.

Graceful termination is good, undefined behavior is not.

vtereshkov avatar Jul 22 '25 22:07 vtereshkov
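
As an illustration of "graceful termination" in Go, here is a minimal sketch of the first example translated by hand (the names p, foo and store are this sketch's own, and it is not the code behind the playground link above). The right-hand side is evaluated in a separate statement so that the outcome does not depend on the unspecified operand order; the nil dereference then surfaces as an ordinary runtime panic that recover can turn into an error.

```go
package main

import "fmt"

var p *int

// foo invalidates the global pointer before returning its value.
func foo() int {
	p = nil
	return 666
}

// store performs the assignment in two steps and converts a runtime
// panic into an error instead of letting the process die.
func store() (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	p = new(int)
	*p = 42
	x := foo() // RHS evaluated in its own statement, so the order is fixed
	*p = x     // p is nil here: a runtime panic, not undefined behavior
	return nil
}

func main() {
	fmt.Println(store())
}
```

Writing *p = foo() as a single statement may behave the same way in practice, but as the spec excerpt above suggests, that order is not guaranteed.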

Would forcing this into two statements fix it? That is, only allowing a plain variable on the right-hand side when assigning through a dereferenced pointer:

// p is global
p = new(int, 42)
x := foo()
p^ = x
// "p^ = foo()" doesn't compile

This would be less convenient and not backwards compatible, so it may not be an option.

Xceptionull avatar Sep 22 '25 08:09 Xceptionull

@Xceptionull It would. But what about the other examples? I don't want to impose so many artificial restrictions just to work around the memory management defect. Thankfully, this defect never manifests itself in normal Umka code.

vtereshkov avatar Sep 22 '25 09:09 vtereshkov

A general restriction could be: no function calls in a statement that dereferences a pointer. But honestly, I think a runtime error would be better than adding this restriction. That only applies if you want to ensure correctness above all else.

Xceptionull avatar Sep 22 '25 09:09 Xceptionull