Ben Vanik
Ouch, thanks for pointing out that WGPU_WHOLE_SIZE was not allowed for dynamic bindings - definitely hadn't caught that and was about to build a GPUBindGroupArena-alike assuming it would work. Is...
The issue is when you have thousands of bind groups being created in a futile attempt to let the implementation elide barriers by letting it know there are no hazards...
Interesting! I think that highlights the core issue being the ownership transfer requirement: concurrent access from both devices is what is desired even if controlled by the queue via a...
TLDR: a synchronous try-map - even if restricted to workers - would make things dramatically better and may negate the immediate need for a readBuffer - but a readBuffer would...
Makes sense! We've mostly found the same thing from an execution-time perspective and just treat them as a way to memoize recording. It's nice being able to record a few...
Nit: _you_ don't see this in compute work today - but _we_ do - and we'd love for more of these kinds of workloads to work in WebGPU so others...
I'm just getting started with HA and found this while trying to get my SHP integrated - I've got the last release installed via HACS but haven't figured out how...
not a fan of the duplication - I'd say add an iree/base/internal/time.h/.c, put the function in there as `int64_t iree_time_now_ns(void)`, call it from the iree/base/time.c `iree_time_now` and also from the...
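A minimal sketch of the de-duplication suggested above, assuming the shared helper can be built on C11 `timespec_get`. The real implementation may use a platform-specific clock instead. The `iree_time_t` typedef here is a stand-in for illustration, and both files are collapsed into one listing. The second call site is cut off in the comment, so it is not shown.

```c
// Sketch only: hypothetical contents of iree/base/internal/time.h/.c and the
// forwarding call in iree/base/time.c, collapsed into a single listing.
#include <stdint.h>
#include <time.h>

// Shared helper: current wall-clock time in nanoseconds.
// Assumption: C11 timespec_get is available on the target platform.
int64_t iree_time_now_ns(void) {
  struct timespec ts;
  if (timespec_get(&ts, TIME_UTC) != TIME_UTC) {
    return 0;  // Clock query failed; 0 is an arbitrary fallback for this sketch.
  }
  return (int64_t)ts.tv_sec * 1000000000ll + (int64_t)ts.tv_nsec;
}

// Stand-in for the public time type (assumption for this sketch).
typedef int64_t iree_time_t;

// Public entry point forwards to the shared helper instead of duplicating it.
iree_time_t iree_time_now(void) {
  return (iree_time_t)iree_time_now_ns();
}
```

Keeping the clock query in one internal function means any other caller can reuse it without copying the platform logic.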
Closing now in favor of a shared/multi-device to main merge PR.
it may do different things (as it also changes public ABI), but maybe adding an f32->bf16 to https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/InputConversion/Common/ConvertPrimitiveType.cpp#L308 could help? or, if it works but does more than you want...