Crash if LHS is invalidated during RHS evaluation
In assignments:
var p: ^int
fn foo(): int {
    p = null
    return 666
}
fn main() {
    p = new(int, 42)
    p^ = foo() // Crash!
    printf("%llv\n", p)
}
In expressions:
type P = struct {
    x: int
}
fn foo(p: ^^P): P {
    v := p^^
    p^ = null
    return v
}
fn main() {
    p := &P{5}
    if p^ == foo(&p) { // Crash!
        printf("Equal\n")
    } else {
        printf("Not equal\n")
    }
}
In built-in function calls:
var a: ^[]int
fn foo(): int {
    a = null
    return 0
}
fn main() {
    a = new([]int)
    b := slice(a^, foo()) // Crash!
    printf("%v %v\n", a^, b)
}
In indexing:
var p: ^[2]int
fn foo(): int {
    p = null
    return 0
}
fn main() {
    p = new([2]int)
    p[foo()] = 666 // Crash!
    printf("OK\n")
}
Does "crash" in this case mean that the VM itself crashes, rather than the Umka program terminating with an error?
@marekmaskarinec
Short answer: yes.
To be honest, I don't know how these examples should behave. None of them is sane Umka code. They should either silently produce meaningless results (i.e. results that immediately get discarded, as they are no longer attached to identifiers), or trigger null pointer runtime errors, but not crash the interpreter, of course.
These examples violate the fundamental memory safety guarantee:
- If a (strong) pointer exists anywhere, it must be valid
- For a pointer to be valid, it must be counted as +1 ref
It follows that any pointer being pushed onto the stack should increment the ref count, even if it's not attached to an identifier. However, if I implemented this principle literally, in a brute-force manner, I would kill performance.
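To make the principle concrete, here is a minimal toy model in Go of what "counting the pointer on the stack" would mean for `p^ = foo()`. This is only a sketch under my own assumptions: `cell`, `retain`, `release` and `assignThroughPointer` are hypothetical names and do not correspond to anything in the actual Umka VM.

```go
package main

import "fmt"

// cell is a hypothetical reference-counted heap object.
// It is an illustration only, not a structure from the Umka VM.
type cell struct {
	refs  int
	value int
	freed bool
}

// newCell creates an object owned by exactly one reference.
func newCell(v int) *cell { return &cell{refs: 1, value: v} }

func retain(c *cell) { c.refs++ }

func release(c *cell) {
	c.refs--
	if c.refs == 0 {
		c.freed = true // a real VM would return the memory here
	}
}

// assignThroughPointer models `p^ = foo()`, where slot plays the role of
// the global p. With pin == true, the LHS pointer is counted as +1 ref
// while it sits on the evaluation stack.
// It reports whether the final store hit a live object.
func assignThroughPointer(slot **cell, pin bool) bool {
	lhs := *slot // LHS pointer is pushed onto the stack
	if pin {
		retain(lhs)
	}

	// RHS: models foo(), which executes `p = null` and returns 666.
	release(*slot)
	*slot = nil
	rhs := 666

	if lhs.freed {
		return false // in a real VM this store would be a use-after-free
	}
	lhs.value = rhs
	if pin {
		release(lhs) // pop the temporary reference
	}
	return true
}

func main() {
	p := newCell(42)
	fmt.Println("store is safe without pinning:", assignThroughPointer(&p, false))

	q := newCell(42)
	fmt.Println("store is safe with pinning:", assignThroughPointer(&q, true))
}
```

The extra `retain`/`release` pair on every pointer that passes through the stack is exactly the cost that a brute-force implementation would pay.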
I feel like a hostage to my "correctness first" ideal. Had I discovered these examples five years ago, I might have given up reference counting in favor of a tracing garbage collector that scans the whole stack.
All these examples produce a nil pointer runtime error in Go.
It's curious that even some more straightforward examples that seem to be correct still fail:
https://go.dev/play/p/xEO_oAc3k4-
According to the Go spec, they may or may not fail, because the order of evaluation is fixed only for function calls, method calls, receive operations (<-) and the binary logical operators (|| and &&), but not for other operands:
"Otherwise, when evaluating the operands of an expression, assignment, or return statement, all function calls, method calls, receive operations, and binary logical operations are evaluated in lexical left-to-right order."
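A small Go sketch of the part that the spec does guarantee: function calls inside one expression run in lexical left-to-right order, regardless of operator precedence. The helper names `step` and `evalOrder` are made up for this illustration. What is not guaranteed is the order between a call and a plain variable read in the same expression, which is the situation in the examples above.

```go
package main

import "fmt"

// trace records the order in which the calls are actually executed.
var trace string

// step appends its name to the trace and returns v unchanged.
func step(name string, v int) int {
	trace += name
	return v
}

// evalOrder evaluates a() + b()*c(). The multiplication binds tighter,
// but the calls themselves still run left to right: a, then b, then c.
func evalOrder() (int, string) {
	trace = ""
	r := step("a", 1) + step("b", 2)*step("c", 3)
	return r, trace
}

func main() {
	r, order := evalOrder()
	fmt.Println(r, order)
}
```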
Graceful termination is good, undefined behavior is not.
Would forcing this to be done in two statements fix it? That is, only allowing a simple variable to be assigned through a pointer that is being dereferenced:
// p is global
p = new(int, 42)
x := foo()
p^ = x
// "p^ = foo()" doesn't compile
This would be less convenient and would not be backwards compatible, so it may not be an option.
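For comparison, here is a rough Go analog of the two-statement form. It is only a sketch, and `storeInTwoSteps` is a name invented for this example. Because the right-hand side is fully evaluated before the pointer is dereferenced, the failure is deterministic: the store goes through a nil pointer and raises an ordinary runtime panic, which the program can recover from.

```go
package main

import "fmt"

// p plays the role of the global pointer from the first example.
var p *int

// foo invalidates p as a side effect and returns the value to store.
func foo() int {
	p = nil
	return 666
}

// storeInTwoSteps evaluates the RHS in its own statement and only then
// dereferences p. It returns the recovered runtime error, if any.
func storeInTwoSteps() (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()

	p = new(int)
	*p = 42

	x := foo() // RHS is evaluated completely; p is nil afterwards
	*p = x     // nil pointer dereference: a runtime panic, not a crash
	return nil
}

func main() {
	fmt.Println(storeInTwoSteps())
}
```

So the split does not make the code correct, but it does turn the problem into a well-defined runtime error.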
@Xceptionull It would. But what to do with the other examples? I don't want to impose so many artificial restrictions just to work around the memory management defect. Thankfully, this defect never manifests itself in normal Umka code.
A general restriction could be: no function calls in a statement that contains a pointer dereference. Personally, though, I think raising a runtime error would be better than adding this restriction. The restriction only makes sense if you want to ensure correctness above all else.