`and` cannot operate on vectors of bool despite the manual claiming otherwise
Zig Version
0.11.0-dev.1300+7cb2f9222
Steps to Reproduce and Observed Behavior
zig test this code:
test {
const bv = @Vector(4, bool){false, true, false, true};
const ma = @Vector(4, bool){true, true, false, false};
_ = bv and ma;
}
An error is output:
error: expected type 'bool', found '@Vector(4, bool)'
_ = bv and ma;
^~
Expected Behavior
The manual says that
Vectors support the same builtin operators as their underlying base types.
This isn't the case for and. Based on the manual, I expected that bv and ma would evaluate to @Vector(4, bool){false, true, false, false}.
The same applies to or and !.
I'm surprised to see this relabeled from documentation issue to implementation bug, so I'll write down my perspective/interpretation to maybe start the discussion of what the desired resolution should be.
Background
First of all, there are two variants of the mathematical operations named here in Zig:
- defined on bits ("bitwise"):
~,|,& - defined on booleans:
!,or,and
The unary operations are less interesting, while ~ is defined to flip every bit in an integer, for a boolean it is defined to exchange true and false.
While this seems identical if a boolean is viewed to have a single backing bit, in hardware they might be larger (say an entire byte or register),
giving more freedom in how the two states are represented, which is why the compiler might choose to emit different instructions for each.
The binary operations have a more important, fundamental distinction, in that or and and are short-circuiting.
That means that the right-hand operand is only evaluated (its calculation executed) if the result isn't dictated by the left-hand operand:
x==true |
x==false |
|
|---|---|---|
x and f() |
f() is executed |
f() not executed (always results in false) |
x or f() |
f() not executed (always results in true) |
f() is executed |
(Note: This leads to the neat consistency that Zig's keyword operators (if, catch, etc.) influence control flow, while symbol operators (+, !, etc.) don't.)
Now Zig defines operations on vectors to generally work element-wise.
So a = b + c with 4-element vectors results in a vector holding the sum of each element, so a[i] == b[i] + c[i].
Vector types are meant to express SIMD (Single Instruction (with/on) Multiple Data) operations, and the compiler will try to lower them using appropriate instructions (where reasonably efficient).
However, while SIMD instruction sets have traditionally been good at expressing parallel operations,
expressing "element-wise optional" operations (f.e. by masking some elements) is more complex.
Some instruction sets give more tools for this, and some architectures and implementations are more efficient at this, than others.
Now the crux is this: To generalize short-circuiting operations and and or to vector elements, we need to decide on their impact on control flow.
I see 3 options for this (using example snippet bv and f()):
- (easy, but imo less regular) Make them non-short-circuit.
Always fully evaluate the right-hand side
f(), then do the binary operation. - (a litte more costly to implement) Make them short-circuit only when the right-hand side
f()would be fully discarded. This means we would need to decide control flow based on whether left-hand sidebvis fullytrueforor, and fullyfalseforand. - (very complex, but imo the most regular) Make them element-wise short-circuit. Emit a compiler error where this is impossible, either due to theoretical limitations (ambiguity) or practical limitations in the compiler (f.e. external code).
Currently the operation isn't implemented, and afaik what Zig should choose to do hasn't been decided yet. While 3 would be really neat for aesthetic reasons (i.e. sense of accomplishment), and although there might be advantages in certain scenarios, implementing that behavior would surely be a technical challenge (and I won't go into even more detail unprovoked).
I personally dislike option 1 for the reason that, if we allowed bitwise operators ~, |, and & on bool values with non-surprising semantics,
then they would be exactly equivalent to option 1 above.
Having both types of operators map to the same operation here, on top of maybe being surprising, nets us no benefit over just not supporting or and and on vectors,
which is why I'd personally prefer to see option 2 AND bitwise operations on bool values implemented.
I think Option 1 Make them non-short-circuit is the best option.
I personally think of vector instructions like this:
z = x() + y();
// is conceptually equivalent to doing this:
var temp1 = x();
var temp2 = y();
z[0] = temp1[0] + temp2[0];
z[1] = temp1[1] + temp2[1];
...
When thinking like this the and operator automatically becomes non-short-circuit:
z = x() and y();
// is conceptually equivalent to doing this:
var temp1 = x();
var temp2 = y();
z[0] = temp1[0] and temp2[0];
z[1] = temp1[1] and temp2[1];
...
So I would be really surprised if in a small number of cases y() wasn't evaluated.
While Option 2 doesn't fit my mental model, maybe it's at least a useful optimization?
Let's consider the code x or f(y)
How likely is it that a single bool x is true?
50% → we need to evaluate f(y) 50% of the time.
How likely is it that a @Vector(8, bool) x is true in every field?
6% → we need to evaluate f(y) 94 % of the time.
This is not worth in my opinion. Especially considering that on arm @reduce() takes a lot of operations.
So for small f(y) short-circuiting would be significantly slower on arm.
While on the other hand even for the biggest possible f(y) we can't get more than a 6% improvement on average.
And I don't think we need to consider option 3. It defeats the point of vector operations and it would be super complicated to implement.
The manual says that
Vectors support the same builtin operators as their underlying base types.
and and or are not operators, they're control flow keywords.
Making them not short circuit on vectors would be inconsistent with how they normally work, which is very confusing. As for having them do partial short circuiting, that's a potentially-expensive, hidden operation, which is very much in opposition to Zig's design.
My suggestion would be to make & and | work on bools and vectors of bools as non-short-circuiting operations, and keep and and or only for single bools.
andandorare not operators, they're control flow keywords.
and and or are listed in the table of operators, so I would assume that they are operators.
While they're parsed as operators, they explicitly do not have symbolic names like operators do, and they're not treated as operators internally (eg. in Zir they use a special bool_br field rather than pl_node with a Bin payload like the binary operators do).
There have been issues about renaming them to && and || in the past, and the conclusion has been "they're named this way to distinguish them from operators, which don't do control flow"
To summarize, I believe these are proposed solutions so far:
- Make
andandornot short-circuit on Vectors. - Make them short-circuit in one way or another.
- Make
|and&and~/!work on bools and bool vectors. - Change the documentation to reflect that
andandorare not operators.
The semantic effect (haven't checked any codegen) of non-short-circuiting and and or on vectors should be achievable using @select:
const a = @Vector(4, bool){ true, false, true, false };
const b = @Vector(4, bool){ true, true, false, false};
const a_or_b = @select(bool, a, a, b);
const a_and_b = @select(bool, a, b, a);
However, this doesn't really seem to clearly communicate the intent to the compiler or reader to me, and takes a bit to understand especially with something like this
const a = @Vector(4, bool){ true, false, true, false };
const b = @Vector(4, bool){ true, true, false, false};
const any_true = @reduce(.Or, @select(bool, a, a, b)); // @reduce(.Or, a or b)
const all_true = @reduce(.And, @select(bool, a, b, a)); // @reduce(.And, a and b)
const all_true_per_lane = @reduce(.And, @select(bool, a, a, b)); // @reduce(.And, a or b)
const one_lane_both_true = @reduce(.Or, @select(bool, a, a, b)); // @reduce(.Or, a or b)
I wonder if the discussion around short-circuiting is obscuring the importance of clarity and consistency between operators and control flow keywords with operands?
The docs say "Vectors support the same builtin operators as their underlying base types." This is an excellent goal. [ and or not] with booleans, [ & | ~ ] with bits: words go with words, symbols with symbols, as it were. To this end, vector booleans ought to work with the logical control flow keywords only, whereas vector u1 ought to work with the bitwise operators only.
At present, tested on 0.12.0 and 0.13.0, neither the or control flow keyword, nor the | operator, work on boolean vectors.
A set of examples for reference:
// BITS: bitwise operators only; this is good
const b1: u1 = 1;
const b2: u1 = 0;
const together = b1 or b2; // error: expected type 'bool', found 'u1'
const together = b1 | b2; // works
const vec1 = @Vector(2, u1){ 0,0,0,1,0,0,0,1 };
const vec2 = @Vector(2, u1){ 0,1,0,0,0,1,0,0 };
const together = vec1 or vec2; // error: expected type 'bool', found '@Vector(2, u1)'
const together = vec1 | vec2; // works
// BOOLEANS: should work with the _logical control flow keywords [ and or not ] only_
const b1: bool = true;
const b2: bool = false;
const together = b1 or b2; // works
const together = b1 | b2; // error: invalid operands to binary bitwise expression: 'Bool' and 'Bool'
const vec1 = @Vector(2, bool){ false, false};
const vec2 = @Vector(2, bool){ false, true};
const together = vec1 or vec2; // error: expected type 'bool', found '@Vector(2, bool)' <-- THIS SHOULD WORK
const together = vec1 | vec2; // error: invalid operands to binary bitwise expression: 'Vector' and 'Vector' <-- does not and should not work
The discussion of which operators short-circuit versus which don't should take place within the above constraints, to keep their use consistent. So:
- [
andornot] -> work on booleans and vector booleans - [
&|~] -> work on bits and vector bits
If it is decided that the above consistency should be achieved, then this decision can guide how short circuiting is generalized, in a separate discussion. Something to the effect of:
- [
andornot&|] -> short-circuit - [
?????] -> full evaluation
If the above proposal is not preferred, then the situation looks like so:
- [
andornot&|] -> work on booleans, vector bools, bits, and vector bits; short-circuits - [
?????] -> work on booleans, vector bools, bits, and vector bits; do Not short circuit
The inherent problem with and potentially short-circuiting with vectors of bool is that there is no machine instruction for doing that, and if it lacks a hardware equivalent, it's probably not worth having as an operation on vectors.
Here's a quick implementation of non-short-circuiting and, or, and ´!´ for bool-vectors using u8-vectors, notice the clean assembly on x86 (codegen for u1 is shit for some reason)
vand_256:
vmovaps ymm0, ymmword ptr [rdx]
vandps ymm0, ymm0, ymmword ptr [rsi]
vmovaps ymmword ptr [rdi], ymm0
vzeroupper
ret
vor_256:
vmovaps ymm0, ymmword ptr [rdx]
vorps ymm0, ymm0, ymmword ptr [rsi]
vmovaps ymmword ptr [rdi], ymm0
vzeroupper
ret
vnot_256:
vpcmpeqd ymm0, ymm0, ymm0
vpxor ymm0, ymm0, ymmword ptr [rsi]
vmovdqa ymmword ptr [rdi], ymm0
vzeroupper
ret
vand_128:
vmovaps xmm0, xmmword ptr [rdx]
vandps xmm0, xmm0, xmmword ptr [rsi]
vmovaps xmmword ptr [rdi], xmm0
ret
vor_128:
vmovaps xmm0, xmmword ptr [rdx]
vorps xmm0, xmm0, xmmword ptr [rsi]
vmovaps xmmword ptr [rdi], xmm0
ret
vnot_128:
vpcmpeqd xmm0, xmm0, xmm0
vpxor xmm0, xmm0, xmmword ptr [rsi]
vmovdqa xmmword ptr [rdi], xmm0
ret
vand_64:
mov rax, qword ptr [rdx]
and rax, qword ptr [rsi]
mov qword ptr [rdi], rax
ret
vor_64:
mov rax, qword ptr [rdx]
or rax, qword ptr [rsi]
mov qword ptr [rdi], rax
ret
vnot_64:
mov rax, qword ptr [rsi]
not rax
mov qword ptr [rdi], rax
ret
vand_32:
mov eax, dword ptr [rdx]
and eax, dword ptr [rsi]
mov dword ptr [rdi], eax
ret
vor_32:
mov eax, dword ptr [rdx]
or eax, dword ptr [rsi]
mov dword ptr [rdi], eax
ret
vnot_32:
mov eax, dword ptr [rsi]
not eax
mov dword ptr [rdi], eax
ret
source
const std = @import("std");
// FIXME: Vectors of u1 are horrendously unoptimized, so we use
// vectors of u8, which should produce equivalent results
fn Amogus(comptime n: comptime_int) type {
return struct {
fn vorInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8), b: *const @Vector(n / 8, u8)) void {
c.* = b.* | a.*;
}
fn vor(c: *@Vector(n, bool), a: *const @Vector(n, bool), b: *const @Vector(n, bool)) callconv(.C) void {
vorInternal(@ptrCast(c), @ptrCast(a), @ptrCast(b));
}
fn vandInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8), b: *const @Vector(n / 8, u8)) void {
c.* = b.* & a.*;
}
fn vand(c: *@Vector(n, bool), a: *const @Vector(n, bool), b: *const @Vector(n, bool)) callconv(.C) void {
vandInternal(@ptrCast(c), @ptrCast(a), @ptrCast(b));
}
fn vnotInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8)) void {
c.* = ~a.*;
}
fn vnot(c: *@Vector(n, bool), a: *const @Vector(n, bool)) callconv(.C) void {
vnotInternal(@ptrCast(c), @ptrCast(a));
}
comptime {
if (@sizeOf(@Vector(n / 8, u8)) != @sizeOf(@Vector(n, bool))) @compileError("shit");
@export(&vand, .{ .name = std.fmt.comptimePrint("vand_{d}", .{n}) });
@export(&vor, .{ .name = std.fmt.comptimePrint("vor_{d}", .{n}) });
@export(&vnot, .{ .name = std.fmt.comptimePrint("vnot_{d}", .{n}) });
}
};
}
comptime {
_ = Amogus(256);
_ = Amogus(128);
_ = Amogus(64);
_ = Amogus(32);
}
Why is there even a discussion on whether and and orshould short circuit? A vector of bool is not a bool in the same way an array of bool is not a bool. The only thing I expect to short circuit is boolean expressions, a vector of bool isn't a bool and evaluating v and u using vectors requires all elements to be evaluated anyway so what is even gained by short circuiting and it what scenarios would it even have an effect?
A complete solution might be:
- reactivate
&and|for bools, explicitly issue guidance that these are non-short-circuiting - allow use of "bitwise" operators for vector(bool). Note this is consistent With what is happening under the hood.
- disallow
andandoron vector bool (no change) - status quo for
andandoroperations on scalar bool - remove
andandorfrom official list of operators in the docs (or put them at the end with a special note)
Seeing "==" return an @Vector(4, bool) sure shocked me when I hit it coming from GLSL. HLSL and WGSL seem to operate the way Zig does, though.
WGSL only allows short circuiting on "||" or "&&" with scalar bools. HLSL never allows short-circuiting. Metal seems to be component-wise without short circuiting (at least not obviously mentioned in the spec but I'm not a Metal guy). GLSL appears to be odd man out with just being very weird.
I suspect that "and"/"or" should only operate on a "bool" with the "bitwise" operators operating on bool vectors.
The only wrinkle that I can think of is that there are "any" and "all" type of intrinsics for bool vectors to collapse them to bool scalars and I'm not sure how those would fit here.
Just ran into this issue while writing some vectorized code. I was a little shocked it didn't work at first, but eventually realized the problem with short-circuiting. I just want to pitch my 2 cents in with enabling bitwise operators on vectors of bools - after all, bit-fields can be thought of as "vectors" of boolean values.
I'm not jumping up and down about short-circuiting, but I'm kinda leaning the other way on "==" and "!=" in that Zig should collapse the vectors to a scalar boolean.
The problem is that you wind up writing all(foo == bar) and any(foo != bar) a lot. That gives you two opportunities to screw up if you want to change the condition. any(foo == bar) is rarely what you want and all(foo != bar) is rarely what you want, but if you flip the comparison and forget to flip the collector, you get fairly pernicious runtime bugs.
I'd like to build on what @Trevor-Strong said to hopefully make it clear that short circuiting doesn't really mean anything meaningful here. The short circuiting feature of logical operators is only meaningful in the context of boolean expressions, which may have side effects, or which may be computationally expensive. For example:
if (my_bool and expensiveFunction()) {
// ...
}
will avoid evaluating expensiveFunction() entirely if my_bool is false, because we already know that the compound expression will be false. Consider this alternative:
const a = my_bool;
const b = expensiveFunction();
if (a and b) {
// ...
}
This is logically equivalent to the first example; the body of the if will only be evaluated if both my_bool and expensiveFunction() are true in both cases. In the second one both operands are eagerly evaluated, and thus no computation is avoided. This would be useful to do if expensiveFunction() had some desirable side effect that we wanted regardless of if my_bool were true or not.
Vectors in Zig contain values, not compound expressions. A logical expression like c[0] and d[0] will never evaluate a function call like expensiveFunction(). In order for the value to be in the array or vector so it can be accessed by index, it must be computed and stored in the array or vector, just like how a and b were declared before prior to being accessed.
The least surprising behavior to me, knowing how vectors of integers behave in Zig with arithmetic operators, is for logical operators to operate element-wise resulting in a new vector of booleans. .{ true, false } and .{ true, true } should evaluate to .{ true, false }. Requiring bitwise operators for logical expressions would be surprising to me, because my intent is to do logical operations on boolean values, not bit fiddling.
and and or will continue to not work for vectors of booleans because they are keywords indicating that they affect control flow, which is not the case for vectors.
Please see #24093 for the accepted syntax proposal to accomplish these operations.
despite the manual claiming otherwise
Sure you could call them operators, but you could also call them syntactical constructs making them exempt from such parlance. But it's not worth debating - the langref can be adjusted to gracefully avoid ambiguity either way.