zig icon indicating copy to clipboard operation
zig copied to clipboard

`and` cannot operate on vectors of bool despite the manual claiming otherwise

Open notcancername opened this issue 2 years ago • 16 comments

Zig Version

0.11.0-dev.1300+7cb2f9222

Steps to Reproduce and Observed Behavior

zig test this code:

test {
    const bv = @Vector(4, bool){false, true, false, true};
    const ma = @Vector(4, bool){true, true, false, false};
    _ = bv and ma;
}

An error is output:

error: expected type 'bool', found '@Vector(4, bool)'
    _ = bv and ma;
        ^~

Expected Behavior

The manual says that

Vectors support the same builtin operators as their underlying base types.

This isn't the case for and. Based on the manual, I expected that bv and ma would evaluate to @Vector(4, bool){false, true, false, false}.

The same applies to or and !.

notcancername avatar Jan 13 '23 21:01 notcancername

I'm surprised to see this relabeled from documentation issue to implementation bug, so I'll write down my perspective/interpretation to maybe start the discussion of what the desired resolution should be.

Background

First of all, there are two variants of the mathematical operations named here in Zig:

  • defined on bits ("bitwise"): ~, |, &
  • defined on booleans: !, or, and

The unary operations are less interesting, while ~ is defined to flip every bit in an integer, for a boolean it is defined to exchange true and false. While this seems identical if a boolean is viewed to have a single backing bit, in hardware they might be larger (say an entire byte or register), giving more freedom in how the two states are represented, which is why the compiler might choose to emit different instructions for each.

The binary operations have a more important, fundamental distinction, in that or and and are short-circuiting. That means that the right-hand operand is only evaluated (its calculation executed) if the result isn't dictated by the left-hand operand:

x==true x==false
x and f() f() is executed f() not executed (always results in false)
x or f() f() not executed (always results in true) f() is executed

(Note: This leads to the neat consistency that Zig's keyword operators (if, catch, etc.) influence control flow, while symbol operators (+, !, etc.) don't.)

Now Zig defines operations on vectors to generally work element-wise. So a = b + c with 4-element vectors results in a vector holding the sum of each element, so a[i] == b[i] + c[i]. Vector types are meant to express SIMD (Single Instruction (with/on) Multiple Data) operations, and the compiler will try to lower them using appropriate instructions (where reasonably efficient). However, while SIMD instruction sets have traditionally been good at expressing parallel operations, expressing "element-wise optional" operations (f.e. by masking some elements) is more complex. Some instruction sets give more tools for this, and some architectures and implementations are more efficient at this, than others.


Now the crux is this: To generalize short-circuiting operations and and or to vector elements, we need to decide on their impact on control flow. I see 3 options for this (using example snippet bv and f()):

  1. (easy, but imo less regular) Make them non-short-circuit. Always fully evaluate the right-hand side f(), then do the binary operation.
  2. (a litte more costly to implement) Make them short-circuit only when the right-hand side f() would be fully discarded. This means we would need to decide control flow based on whether left-hand side bv is fully true for or, and fully false for and.
  3. (very complex, but imo the most regular) Make them element-wise short-circuit. Emit a compiler error where this is impossible, either due to theoretical limitations (ambiguity) or practical limitations in the compiler (f.e. external code).

Currently the operation isn't implemented, and afaik what Zig should choose to do hasn't been decided yet. While 3 would be really neat for aesthetic reasons (i.e. sense of accomplishment), and although there might be advantages in certain scenarios, implementing that behavior would surely be a technical challenge (and I won't go into even more detail unprovoked).

I personally dislike option 1 for the reason that, if we allowed bitwise operators ~, |, and & on bool values with non-surprising semantics, then they would be exactly equivalent to option 1 above. Having both types of operators map to the same operation here, on top of maybe being surprising, nets us no benefit over just not supporting or and and on vectors, which is why I'd personally prefer to see option 2 AND bitwise operations on bool values implemented.

rohlem avatar Jan 14 '23 01:01 rohlem

I think Option 1 Make them non-short-circuit is the best option.

I personally think of vector instructions like this:

z = x() + y();
// is conceptually equivalent to doing this:
var temp1 = x();
var temp2 = y();
z[0] = temp1[0] + temp2[0];
z[1] = temp1[1] + temp2[1];
...

When thinking like this the and operator automatically becomes non-short-circuit:

z = x() and y();
// is conceptually equivalent to doing this:
var temp1 = x();
var temp2 = y();
z[0] = temp1[0] and temp2[0];
z[1] = temp1[1] and temp2[1];
...

So I would be really surprised if in a small number of cases y() wasn't evaluated. ​ ​

While Option 2 doesn't fit my mental model, maybe it's at least a useful optimization?

Let's consider the code x or f(y) How likely is it that a single bool x is true? ​ ​ ​ ​ ​ ​50% → we need to evaluate f(y) 50% of the time. How likely is it that a @Vector(8, bool) x is true in every field? ​ ​ ​ ​ ​ 6% → we need to evaluate f(y) 94 % of the time. This is not worth in my opinion. Especially considering that on arm @reduce() takes a lot of operations. So for small f(y) short-circuiting would be significantly slower on arm. While on the other hand even for the biggest possible f(y) we can't get more than a 6% improvement on average.

And I don't think we need to consider option 3. It defeats the point of vector operations and it would be super complicated to implement.

IntegratedQuantum avatar Jan 14 '23 10:01 IntegratedQuantum

The manual says that

Vectors support the same builtin operators as their underlying base types.

and and or are not operators, they're control flow keywords.

Making them not short circuit on vectors would be inconsistent with how they normally work, which is very confusing. As for having them do partial short circuiting, that's a potentially-expensive, hidden operation, which is very much in opposition to Zig's design.

My suggestion would be to make & and | work on bools and vectors of bools as non-short-circuiting operations, and keep and and or only for single bools.

silversquirl avatar Jan 14 '23 13:01 silversquirl

and and or are not operators, they're control flow keywords.

and and or are listed in the table of operators, so I would assume that they are operators.

IntegratedQuantum avatar Jan 14 '23 14:01 IntegratedQuantum

While they're parsed as operators, they explicitly do not have symbolic names like operators do, and they're not treated as operators internally (eg. in Zir they use a special bool_br field rather than pl_node with a Bin payload like the binary operators do).

There have been issues about renaming them to && and || in the past, and the conclusion has been "they're named this way to distinguish them from operators, which don't do control flow"

silversquirl avatar Jan 14 '23 14:01 silversquirl

To summarize, I believe these are proposed solutions so far:

  • Make and and or not short-circuit on Vectors.
  • Make them short-circuit in one way or another.
  • Make | and & and ~/! work on bools and bool vectors.
  • Change the documentation to reflect that and and or are not operators.

notcancername avatar Jan 16 '23 00:01 notcancername

The semantic effect (haven't checked any codegen) of non-short-circuiting and and or on vectors should be achievable using @select:

const a = @Vector(4, bool){ true, false, true, false };
const  b = @Vector(4, bool){ true, true, false, false};
const a_or_b = @select(bool, a, a, b);
const a_and_b = @select(bool, a, b, a);

However, this doesn't really seem to clearly communicate the intent to the compiler or reader to me, and takes a bit to understand especially with something like this

const a = @Vector(4, bool){ true, false, true, false };
const b = @Vector(4, bool){ true, true, false, false};
const any_true = @reduce(.Or, @select(bool, a, a, b));           // @reduce(.Or, a or b)
const all_true = @reduce(.And, @select(bool, a, b, a));          // @reduce(.And, a and b)
const all_true_per_lane = @reduce(.And, @select(bool, a, a, b)); // @reduce(.And, a or b)
const one_lane_both_true = @reduce(.Or, @select(bool, a, a, b)); // @reduce(.Or, a or b)

dweiller avatar Jul 08 '23 07:07 dweiller

I wonder if the discussion around short-circuiting is obscuring the importance of clarity and consistency between operators and control flow keywords with operands?

The docs say "Vectors support the same builtin operators as their underlying base types." This is an excellent goal. [ and or not] with booleans, [ & | ~ ] with bits: words go with words, symbols with symbols, as it were. To this end, vector booleans ought to work with the logical control flow keywords only, whereas vector u1 ought to work with the bitwise operators only.

At present, tested on 0.12.0 and 0.13.0, neither the or control flow keyword, nor the | operator, work on boolean vectors.

A set of examples for reference:

// BITS: bitwise operators only; this is good

const b1: u1 = 1;
const b2: u1 = 0;
const together = b1 or b2;  // error: expected type 'bool', found 'u1'
const together = b1 | b2;  // works

const vec1 = @Vector(2, u1){ 0,0,0,1,0,0,0,1 };
const vec2 = @Vector(2, u1){ 0,1,0,0,0,1,0,0 };
const together = vec1 or vec2;  // error: expected type 'bool', found '@Vector(2, u1)'
const together = vec1 | vec2;  // works


// BOOLEANS: should work with the _logical control flow keywords [ and or not ] only_

const b1: bool = true;
const b2: bool = false;
const together = b1 or b2;  // works
const together = b1 | b2;  // error: invalid operands to binary bitwise expression: 'Bool' and 'Bool'

const vec1 = @Vector(2, bool){ false, false};
const vec2 = @Vector(2, bool){ false, true};
const together = vec1 or vec2;  // error: expected type 'bool', found '@Vector(2, bool)'  <-- THIS SHOULD WORK
const together = vec1 | vec2;  // error: invalid operands to binary bitwise expression: 'Vector' and 'Vector'  <-- does not and should not work

The discussion of which operators short-circuit versus which don't should take place within the above constraints, to keep their use consistent. So:

  • [ and or not ] -> work on booleans and vector booleans
  • [ & | ~ ] -> work on bits and vector bits

If it is decided that the above consistency should be achieved, then this decision can guide how short circuiting is generalized, in a separate discussion. Something to the effect of:

  • [ and or not & | ] -> short-circuit
  • [ ? ? ? ? ? ] -> full evaluation

If the above proposal is not preferred, then the situation looks like so:

  • [ and or not & | ] -> work on booleans, vector bools, bits, and vector bits; short-circuits
  • [ ? ? ? ? ? ] -> work on booleans, vector bools, bits, and vector bits; do Not short circuit

skytwosea avatar Sep 17 '24 04:09 skytwosea

The inherent problem with and potentially short-circuiting with vectors of bool is that there is no machine instruction for doing that, and if it lacks a hardware equivalent, it's probably not worth having as an operation on vectors.

notcancername avatar Sep 17 '24 06:09 notcancername

Here's a quick implementation of non-short-circuiting and, or, and ´!´ for bool-vectors using u8-vectors, notice the clean assembly on x86 (codegen for u1 is shit for some reason)

vand_256:
	vmovaps	ymm0, ymmword ptr [rdx]
	vandps	ymm0, ymm0, ymmword ptr [rsi]
	vmovaps	ymmword ptr [rdi], ymm0
	vzeroupper
	ret

vor_256:
	vmovaps	ymm0, ymmword ptr [rdx]
	vorps	ymm0, ymm0, ymmword ptr [rsi]
	vmovaps	ymmword ptr [rdi], ymm0
	vzeroupper
	ret

vnot_256:
	vpcmpeqd	ymm0, ymm0, ymm0
	vpxor	ymm0, ymm0, ymmword ptr [rsi]
	vmovdqa	ymmword ptr [rdi], ymm0
	vzeroupper
	ret

vand_128:
	vmovaps	xmm0, xmmword ptr [rdx]
	vandps	xmm0, xmm0, xmmword ptr [rsi]
	vmovaps	xmmword ptr [rdi], xmm0
	ret

vor_128:
	vmovaps	xmm0, xmmword ptr [rdx]
	vorps	xmm0, xmm0, xmmword ptr [rsi]
	vmovaps	xmmword ptr [rdi], xmm0
	ret

vnot_128:
	vpcmpeqd	xmm0, xmm0, xmm0
	vpxor	xmm0, xmm0, xmmword ptr [rsi]
	vmovdqa	xmmword ptr [rdi], xmm0
	ret

vand_64:
	mov	rax, qword ptr [rdx]
	and	rax, qword ptr [rsi]
	mov	qword ptr [rdi], rax
	ret

vor_64:
	mov	rax, qword ptr [rdx]
	or	rax, qword ptr [rsi]
	mov	qword ptr [rdi], rax
	ret

vnot_64:
	mov	rax, qword ptr [rsi]
	not	rax
	mov	qword ptr [rdi], rax
	ret

vand_32:
	mov	eax, dword ptr [rdx]
	and	eax, dword ptr [rsi]
	mov	dword ptr [rdi], eax
	ret

vor_32:
	mov	eax, dword ptr [rdx]
	or	eax, dword ptr [rsi]
	mov	dword ptr [rdi], eax
	ret

vnot_32:
	mov	eax, dword ptr [rsi]
	not	eax
	mov	dword ptr [rdi], eax
	ret
source
const std = @import("std");

// FIXME: Vectors of u1 are horrendously unoptimized, so we use
// vectors of u8, which should produce equivalent results
fn Amogus(comptime n: comptime_int) type {
    return struct {
        fn vorInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8), b: *const @Vector(n / 8, u8)) void {
            c.* = b.* | a.*;
        }

        fn vor(c: *@Vector(n, bool), a: *const @Vector(n, bool), b: *const @Vector(n, bool)) callconv(.C) void {
            vorInternal(@ptrCast(c), @ptrCast(a), @ptrCast(b));
        }

        fn vandInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8), b: *const @Vector(n / 8, u8)) void {
            c.* = b.* & a.*;
        }

        fn vand(c: *@Vector(n, bool), a: *const @Vector(n, bool), b: *const @Vector(n, bool)) callconv(.C) void {
            vandInternal(@ptrCast(c), @ptrCast(a), @ptrCast(b));
        }

        fn vnotInternal(c: *@Vector(n / 8, u8), a: *const @Vector(n / 8, u8)) void {
            c.* = ~a.*;
        }

        fn vnot(c: *@Vector(n, bool), a: *const @Vector(n, bool)) callconv(.C) void {
            vnotInternal(@ptrCast(c), @ptrCast(a));
        }

        comptime {
            if (@sizeOf(@Vector(n / 8, u8)) != @sizeOf(@Vector(n, bool))) @compileError("shit");
            @export(&vand, .{ .name = std.fmt.comptimePrint("vand_{d}", .{n}) });
            @export(&vor, .{ .name = std.fmt.comptimePrint("vor_{d}", .{n}) });
            @export(&vnot, .{ .name = std.fmt.comptimePrint("vnot_{d}", .{n}) });
        }
    };
}

comptime {
    _ = Amogus(256);
    _ = Amogus(128);
    _ = Amogus(64);
    _ = Amogus(32);
}

notcancername avatar Sep 17 '24 07:09 notcancername

Why is there even a discussion on whether and and orshould short circuit? A vector of bool is not a bool in the same way an array of bool is not a bool. The only thing I expect to short circuit is boolean expressions, a vector of bool isn't a bool and evaluating v and u using vectors requires all elements to be evaluated anyway so what is even gained by short circuiting and it what scenarios would it even have an effect?

Trevor-Strong avatar Sep 17 '24 21:09 Trevor-Strong

A complete solution might be:

  • reactivate & and | for bools, explicitly issue guidance that these are non-short-circuiting
  • allow use of "bitwise" operators for vector(bool). Note this is consistent With what is happening under the hood.
  • disallow and and or on vector bool (no change)
  • status quo for and and or operations on scalar bool
  • remove and and or from official list of operators in the docs (or put them at the end with a special note)

ityonemo avatar Sep 18 '24 06:09 ityonemo

Seeing "==" return an @Vector(4, bool) sure shocked me when I hit it coming from GLSL. HLSL and WGSL seem to operate the way Zig does, though.

WGSL only allows short circuiting on "||" or "&&" with scalar bools. HLSL never allows short-circuiting. Metal seems to be component-wise without short circuiting (at least not obviously mentioned in the spec but I'm not a Metal guy). GLSL appears to be odd man out with just being very weird.

I suspect that "and"/"or" should only operate on a "bool" with the "bitwise" operators operating on bool vectors.

The only wrinkle that I can think of is that there are "any" and "all" type of intrinsics for bool vectors to collapse them to bool scalars and I'm not sure how those would fit here.

buzmeg avatar Oct 11 '24 00:10 buzmeg

Just ran into this issue while writing some vectorized code. I was a little shocked it didn't work at first, but eventually realized the problem with short-circuiting. I just want to pitch my 2 cents in with enabling bitwise operators on vectors of bools - after all, bit-fields can be thought of as "vectors" of boolean values.

desttinghim avatar Nov 27 '24 04:11 desttinghim

I'm not jumping up and down about short-circuiting, but I'm kinda leaning the other way on "==" and "!=" in that Zig should collapse the vectors to a scalar boolean.

The problem is that you wind up writing all(foo == bar) and any(foo != bar) a lot. That gives you two opportunities to screw up if you want to change the condition. any(foo == bar) is rarely what you want and all(foo != bar) is rarely what you want, but if you flip the comparison and forget to flip the collector, you get fairly pernicious runtime bugs.

buzmeg avatar Nov 27 '24 04:11 buzmeg

I'd like to build on what @Trevor-Strong said to hopefully make it clear that short circuiting doesn't really mean anything meaningful here. The short circuiting feature of logical operators is only meaningful in the context of boolean expressions, which may have side effects, or which may be computationally expensive. For example:

if (my_bool and expensiveFunction()) {
    // ...
}

will avoid evaluating expensiveFunction() entirely if my_bool is false, because we already know that the compound expression will be false. Consider this alternative:

const a = my_bool;
const b = expensiveFunction();

if (a and b) {
    // ...
}

This is logically equivalent to the first example; the body of the if will only be evaluated if both my_bool and expensiveFunction() are true in both cases. In the second one both operands are eagerly evaluated, and thus no computation is avoided. This would be useful to do if expensiveFunction() had some desirable side effect that we wanted regardless of if my_bool were true or not.

Vectors in Zig contain values, not compound expressions. A logical expression like c[0] and d[0] will never evaluate a function call like expensiveFunction(). In order for the value to be in the array or vector so it can be accessed by index, it must be computed and stored in the array or vector, just like how a and b were declared before prior to being accessed.

The least surprising behavior to me, knowing how vectors of integers behave in Zig with arithmetic operators, is for logical operators to operate element-wise resulting in a new vector of booleans. .{ true, false } and .{ true, true } should evaluate to .{ true, false }. Requiring bitwise operators for logical expressions would be surprising to me, because my intent is to do logical operations on boolean values, not bit fiddling.

Zambito1 avatar May 21 '25 21:05 Zambito1

and and or will continue to not work for vectors of booleans because they are keywords indicating that they affect control flow, which is not the case for vectors.

Please see #24093 for the accepted syntax proposal to accomplish these operations.

despite the manual claiming otherwise

Sure you could call them operators, but you could also call them syntactical constructs making them exempt from such parlance. But it's not worth debating - the langref can be adjusted to gracefully avoid ambiguity either way.

andrewrk avatar Jun 06 '25 03:06 andrewrk