Myles C. Maxfield

Results: 283 comments by Myles C. Maxfield

Options:

1. What [MSL](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf#//apple_ref/doc/uid/TP40014364), [HLSL](https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-vector), [CUDA](https://developer.nvidia.com/blog/cuda-pro-tip-increase-performance-with-vectorized-memory-access/), and [`simd/simd.h`](https://developer.apple.com/documentation/accelerate/simd) have: `{bool,int,uint,half,float}{2,3,4}`
2. What [GLSL has](https://www.khronos.org/registry/OpenGL/specs/gl/GLSLangSpec.4.60.html#basic-types): `{vec,bvec,ivec,uvec}{2,3,4}`
3. What [Intel assembly has](https://docs.microsoft.com/en-us/cpp/cpp/m128?view=msvc-160): `__m128`, `__m128i`
4. What [ARM NEON assembly has](https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/neon-programmers-guide-for-armv8-a/optimizing-c-code-with-neon-intrinsics/program-conventions): `{int32,uint32,float32}x{2,3,4}_t`
5. ...

Given the popularity of option 1, I think that one is the clear winner.

Note: I'm _not_ proposing to add a `trap` statement to the language. I'm just proposing that the behavior be legal in some circumstances.

This proposal has no effect on WGSL’s uniformity rules. The rules would not change. In the case where the array index is nonuniform, this trapping behavior would either: A. Only...

> How would I cause this behaviour to occur in MSL? There is no built-in intrinsic for this in Metal. This would be achieved by emitting standard normal Metal code...

> If the performance win is there, why not
>
> * implement the slowpath/fastpath check at the top of some section of code
> * then branch to fast...
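
For readers unfamiliar with the pattern the quoted question describes, here is a minimal sketch of hoisting a fast-path/slow-path check to the top of a section of code. It is only an illustration, not the approach discussed in the thread: the function name `sum_indexed` is hypothetical, and the choice to clamp out-of-range indices on the slow path is an assumption made for the example.

```python
def sum_indexed(data, indices):
    """Sum data[i] for each i in indices (hypothetical helper).

    One check at the top decides whether every index is in range.
    Assumption for this sketch: the slow path clamps bad indices.
    """
    n = len(data)
    if n == 0:
        # Nothing can be read from an empty array.
        return 0

    # The check is done once, at the top of the section.
    if all(0 <= i < n for i in indices):
        # Fast path: every access is known to be in bounds,
        # so no per-access handling is needed.
        return sum(data[i] for i in indices)

    # Slow path: clamp each index into [0, n - 1] before reading.
    return sum(data[min(max(i, 0), n - 1)] for i in indices)
```

The trade-off in this sketch is that the up-front check costs one pass over the indices, in exchange for a loop body with no per-access handling when all indices are valid.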

Sounds like we're pretty much resolved, modulo specifics (@kainino0x?) - there's general agreement on the call today to add a new error type.

> For others (sqrt, inverseSqrt, log, log2) consider deviating from JS and WASM, to preserve speed.

We'd prefer to start off with portable behavior (which can be changed in the...

> In options 1 and 2, the following `GPUSampler` state would be obeyed:
>
> * `magFilter`, `minFilter` to provide control over linear vs nearest
> * `maxAnisotropy` probably, for...

Given today's call, I think I should probably add some clarifying descriptions. In it, I proposed option 4 as a compromise. First, I mentioned in the call that we have...