zig
zig copied to clipboard
design flaw: semantics of loading values with undefined bits
Whilst working on #19630, it was suggested by @andrewrk to introduce the semantic rule that when loading a value from a pointer, if any bits are undefined
, the entire value is considered to be undefined
. This rule simplifies the language considerably, because it avoids the need to deal with the awkward concept of a comptime-known value with some undefined bits. However, it causes breakages in some code.
The main example I've noticed is std.PackedIntArray
, which uses a backing buffer to store a sequence of integer values bit-packed in memory. The problem is that if we initialize the buffer to undefined
, any load which crosses a boundary onto an adjacent element -- even if the element being loaded was set previously -- is considered undefined
. Thus, the only way to obey the new semantics is to always @memset
the buffer on creation, which is potentially wasteful.
This issue serves to track this design complexity and potential solutions to it. The example I brought up above could potentially be solved with runtime bit-pointers (related: #16271), but I can't say for sure whether that would solve all potential problems.
For now, this rule is never enforced at runtime, only at comptime. Workarounds have been placed in the standard library where necessary behind @inComptime()
checks (such checks link to this issue).
There's also some occurrences of what's definitely a bug: after a store, the value is turned back into its "true" type for our comptime representation. So this code fails:
test {
comptime var x: u32 = undefined;
const buf: *[4]u8 = @ptrCast(x);
buf[0] = 123;
// The byte is changed, but the value is then read back as a `u32`, so becomes `undefined`!
// So, this check fails:
try @import("std").testing.expectEqual(123, buf[0]);
}
This will be solved by adding a representation to MutableValue
for a "bundle of bytes" -- or, more accurately, a set of bit-packed values of arbitrary types. This representation will only be applicable to types which are not comptime-only.
The main example I've noticed is
std.PackedIntArray
, which uses a backing buffer to store a sequence of integer values bit-packed in memory. The problem is that if we initialize the buffer toundefined
, any load which crosses a boundary onto an adjacent element -- even if the element being loaded was set previously -- is consideredundefined
. Thus, the only way to obey the new semantics is to always@memset
the buffer on creation, which is potentially wasteful.
Not sure if it's a good idea or not, but my first thought as to a possible solution:
diff --git a/lib/std/packed_int_array.zig b/lib/std/packed_int_array.zig
index 02c721e7cf..6c8243d9c5 100644
--- a/lib/std/packed_int_array.zig
+++ b/lib/std/packed_int_array.zig
@@ -123,7 +123,7 @@ pub fn PackedIntIo(comptime Int: type, comptime endian: Endian) type {
//read existing bytes
const target_ptr: *align(1) Container = @ptrCast(&bytes[start_byte]);
- var target = target_ptr.*;
+ var target = @freeze(target_ptr.*);
if (endian != native_endian) target = @byteSwap(target);
related: https://github.com/ziglang/zig/pull/18137