Segfault when accessing a union in a tuple via `Any`
Repro
Minimal version
actor Main
let env: Env
new create(env': Env) =>
env = env'
let x = ("hello", USize(123))
match_x(x) // segfault
fun match_x(x: Any val) =>
env.out.print("match_x(x: Any val): ")
// segfault
match x
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: (USize | U64)) =>
env.out.print("\t(" + a.string() + ", " + b.string() +")")
else
env.out.print("\tno-match")
end
Some more cases
actor Main
let env: Env
new create(env': Env) =>
env = env'
let str = "string"
env.out.print("param is String: " + str)
// works fine
env.out.print("inline match")
match str
| let s: Stringable => env.out.print("\t" + s.string())
end
match_usize(str) // fine
match_number(str) // fine
match_stringable(str) // fine
let x = ("hello", USize(123))
env.out.print("\n\nparam is (String, USize): (" + x._1 + ", " + x._2.string()
+ ")")
// works fine
env.out.print("inline match")
match x
| (let a: Stringable, let b: Stringable) =>
env.out.print("\t(" + a.string() + ", " + b.string() +")")
end
match_usize(x) // fine
match_number(x) // segfault
match_stringable(x) // segfault
fun match_stringable(p: Any val) =>
env.out.print("match_stringable(p: Any): ")
match p
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: Stringable) =>
env.out.print("\t(" + a.string() + ", " + b.string() + ")")
else
env.out.print("\tno-match")
end
fun match_number(p: Any val) =>
env.out.print("match_number(p: Any): ")
match p
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: Number) =>
env.out.print("\t(" + a.string() + ", " + b.string() + ")")
else
env.out.print("\tno-match")
end
fun match_usize(p: Any val) =>
env.out.print("match_usize(p: Any): ")
match p
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: USize) =>
env.out.print("\t(" + a.string() + ", " + b.string() + ")")
else
env.out.print("\tno-match")
end
OS: MacOS, M3 Pony version
% ponyc -v
0.58.3-cb2f814b [debug]
Compiled with: LLVM 15.0.7 -- AppleClang-15.0.0.15000309-arm64
Defaults: pic=true
Backtrace (of the minimal version)
% lldb -- ./12_segfault_tuple
(lldb) target create "./12_segfault_tuple"
Current executable set to 'scratch/pony/12_segfault_tuple/12_segfault_tuple' (arm64).
(lldb) r
Process 21124 launched: 'scratch/pony/12_segfault_tuple/12_segfault_tuple' (arm64)
match_x(x: Any val):
Process 21124 stopped
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x7b)
frame #0: 0x000000010000537c 12_segfault_tuple`___lldb_unnamed_symbol3694 + 372
12_segfault_tuple`___lldb_unnamed_symbol3694:
-> 0x10000537c <+372>: ldr x8, [x0]
0x100005380 <+376>: b 0x100005418 ; <+528>
0x100005384 <+380>: ldr x0, [x8, x10]
0x100005388 <+384>: ldr x8, [x0]
Target 0: (12_segfault_tuple) stopped.
(lldb) bt
* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x7b)
* frame #0: 0x000000010000537c 12_segfault_tuple`___lldb_unnamed_symbol3694 + 372
frame #1: 0x0000000100009518 12_segfault_tuple`handle_message(ctx=0x00000001084e9e08, actor=0x00000001084e8c00, msg=0x00000001084e9300) at actor.c:507:7
frame #2: 0x0000000100008b20 12_segfault_tuple`ponyint_actor_run(ctx=0x00000001084e9e08, actor=0x00000001084e8c00, polling=false) at actor.c:592:20
frame #3: 0x00000001000230c8 12_segfault_tuple`run(sched=0x00000001084e9dc0) at scheduler.c:1075:23
frame #4: 0x0000000100022500 12_segfault_tuple`run_thread(arg=0x00000001084e9dc0) at scheduler.c:1127:3
frame #5: 0x00000001804e6f94 libsystem_pthread.dylib`_pthread_start + 136
The match itself isn't the issue. That is fine. In the original example.
This works:
actor Main
let env: Env
new create(env': Env) =>
env = env'
let x = ("hello", USize(123))
match_x(x) // segfault
fun match_x(x: Any val) =>
env.out.print("match_x(x: Any val): ")
// segfault
match x
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: USize) =>
env.out.print("\t(" + a.string() + ", " + b.string() +")")
else
env.out.print("\tno-match")
end
So does this:
actor Main
let env: Env
new create(env': Env) =>
env = env'
let x = ("hello", USize(123))
match_x(x) // segfault
fun match_x(x: Any val) =>
env.out.print("match_x(x: Any val): ")
// segfault
match x
| let s: Stringable => env.out.print("\t" + s.string())
| (let a: Stringable, let b: (USize | U64)) =>
env.out.print("hello")
else
env.out.print("\tno-match")
end
The issue in the first match is with the combination of a match through the union AND the message send. What exactly is open to question.
The second example is the same issue. It is matching on Number that is type Number is (Int | Float).
So there's something with the tuple being matched against a union and then PROBABLY the usage of the matched in a message send. More investigation is needed to know if it is message send is in fact fully required. Simple testing says yes.
Here's a more minimal example:
actor Main
let _env: Env
new create(env: Env) =>
_env = env
match_x(("hello", USize(123)))
fun match_x(x: Any val) =>
match x
| (let a: Stringable, let b: (USize | U64)) =>
_env.out.print("\t(" + b.string() +")")
end
Note the following where the match is against the known tuple types is fine:
actor Main
let _env: Env
new create(env: Env) =>
_env = env
match_x(("hello", USize(123)))
fun match_x(x: (Stringable, (USize | U64))) =>
match x
| (let a: Stringable, let b: (USize | U64)) =>
_env.out.print("\t(" + b.string() +")")
end
Here's an even more minimal example:
actor Main
new create(env: Env) =>
let x: Any val = ("hello", (USize(123)))
match x
| (let a: Stringable, let b: (USize | U64)) =>
env.out.print(b.string())
end
The message send isn't required. Here's a still more minimal example:
actor Main
new create(e: Env) =>
let x: Any val = ("a", (U8(1)))
match x
| (let a: String, let b: (U8 | U16)) =>
b.string()
end
At this point what seems to be important is the Any val when matching and the union type that we for b and then, calling string on the matched value.
Whoever picks this up can work more from there.
Discussed in the sync call.
Listen to the sync call recording for a more detailed explanation, but what's going on here is:
xis a "boxed" tuple (stored via a layer of indirection as an object pointer with a type descriptor)- the 2nd element of
xis not "boxed" - it is a raw U8 value of 1 - the match expression unpacks the "boxed" tuple to a new tuple with a slightly different shape
- the different shape has the 2nd element as being a "boxed" union (meaning it is a pointer with type descriptor rather than a raw value)
- the mismatch in expectations for that 2nd element causes the compiled code to try to "dereference"
bas if it were a pointer, but it isn't - it's just a U8 value of 1 with some junk data in the other bits - dereferencing the non-pointer as a pointer goes boom (segfault)
To fix this, we would need the compiled code inside the match to recognize the subtle distinction between the two different tuple shapes, and generate different code for different tuple shape cases, with the one case we need here being a case where we need to generated a boxed pointer for b, before we can try to treat it as a pointer.
This is non-trivial, but should be possible.