v
v copied to clipboard
casting to enum result to undefined behavior
V doctor:
OS: linux, Ubuntu 20.04.2 LTS (WSL 2)
Processor: 12 cpus, 64bit, little endian, Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
CC version: cc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
getwd: /home/gladear/workspace
vmodules: /home/gladear/.vmodules
vroot: /home/gladear/workspace/v
vexe: /home/gladear/workspace/v/v
vexe mtime: 2021-05-17 17:13:18
is vroot writable: true
is vmodules writable: true
V full version: V 0.2.2 a9435f3.4c22370
Git version: git version 2.25.1
Git vroot status: weekly.2021.19-54-g4c223706 (1 commit(s) behind V master)
.git/config present: true
thirdparty/tcc status: thirdparty-linux-amd64 7543de81
What did you do?
v -g -o vdbg cmd/v && vdbg enum.v
enum Color { red green blue }
fn color_code(c Color) int {
return match c {
.red { 10 }
.green { 20 }
.blue { 30 }
}
}
fn color_code2(c Color) int {
match c {
.red {
return 10
}
.green {
return 20
}
.blue {
return 30
}
}
}
println(color_code(Color(5)))
println(color_code2(Color(5)))
What did you expect to see?
An error telling me I can't cast to an enum, or at least same result for both function call
What did you see instead?
30
0
I think that casting ints to an enum should be allowed in unsafe {} blocks.
The result of casting ints that are outside the allowed enum values is indeed undefined though, and V's codegen does not expect such enum values 🤔 .
Casting ints to enums is fine, without unsafe{}. It's a very common operation.
Casting ints that are outside should be checked at compile time and runtime
A common operation ? I'm pretty sure it's impossible in a lot of high level languages.
If it is to be done, I propose a "static method" on enums, like Color.from or Color.from to make it clear it's more than just a cast, and that there is a runtime cost
So @medvednikov asked me to take a look at this and I had some thoughts I'm looking for feedback on:
I think casting to an enum from an int like in the example is something that is absolutely desirable. It's used all the time in protocols and low level dev.
The part that looks like the bug to me is that the second example returns 0. There's no good reason it should be doing that, since match is supposed to be exhaustive. Looking at the generated C, the reason why is obvious.
- In the first case, blue is the fallback value by nature of it being last in the list of matches. So any unrecognized value will be interpreted as blue.
- In the second case, there are no match so an empty value is returned.
Both of these behaviors are defensible, however only one of them should manifest.
Ideally one of the following policies should be in place (in no particular order):
- coerce the out of range value into an in-rage value via modulo or similar strategy
- default value for all out of range enum checks
- exceptional value for a given enum (similar to above, but more judgmental)
- explicit check and error at cast site (but I don't think V does this for any other type?)
- alternatively, match could just require an else for enums
The least disruptive change to the compiler code would probably be to default to the final entry in the match for both cases, however it almost certainly will conceal real bugs in the end-user programmer's code which could be a headache to debug. At the very least it would need to be documented.
Whether an operation is common or not (in my personal opinion, it is not common in general, but very niche, but I digress), has no relevance on whether it is safe or unsafe to do.
The fact is that Something(9999) where enum Something { a b c }, will produce an invalid value for the enum -> it violates a language guarantee => it is unsafe, just like pointer arithmetic is unsafe etc.
I think casting to an enum from an int like in the example is something that is absolutely desirable.
I agree, it is desirable. It would just need to be written like: x := unsafe { Something(an_integer_value) }, instead of just x := Something(an_integer_value), so all future readers/maintainers can be alerted in the future that it needs more attention.
I think casting to an enum from an int like in the example is something that is absolutely desirable.
I agree, it is desirable. It would just need to be written like:
x := unsafe { Something(an_integer_value) }, instead of justx := Something(an_integer_value), so all future readers/maintainers can be alerted in the future that it needs more attention.
I think we're in alignment that just doing things behind the scenes without a coherent error is bad. An explicit unsafe here would certainly be a solution that I didn't mention.
It immediately makes me desire a safe function as well.
I am inclined to go with your suggestion to require unsafe but the underlying mechanism that generated the different code paths I believe would persist, so the actual bug may still yet rear it's head.
So I propose 3 actions:
- Both code paths will return an empty value, fixing this underlying inconsistency.
- An
unsafeenum conversion requirement, likely in a different PR. - A separate feature which does a safe int->enum conversion, which returns an optional type.
I am re-opening this, because the merged PR implements only part of what is needed.
enum Color { red green blue }
fn color_code(c Color) int {
return match c {
.red { 10 }
.green { 20 }
.blue { 30 }
}
}
fn color_code2(c Color) int {
match c {
.red {
return 10
}
.green {
return 20
}
.blue {
return 30
}
}
}
fn main() {
n := 5
println(color_code(Color(n)))
println(color_code2(Color(n)))
}
PS D:\Test\v\tt1> v run .
30
0