
Crash on Chrome WebGPU for kernels that bind with aliasing.

Open ArthurBrussee opened this issue 1 year ago • 2 comments

Describe the bug

If an operation uses the same tensor more than once, Chrome can crash, complaining that binding aliased buffers violates the WebGPU specification.

To Reproduce

let tensor = Tensor::<Backend, 2>::zeros([2, 1], &device); 
let test: Tensor<Backend, 2> = tensor.clone().mul(tensor).sum_dim(1); // fails!
// For some reason this works if the tensor is 1D? Or if not summing / presumably with some other operation?
// It might be slightly more involved than I thought!
// The following does work:
// let test: Tensor<Backend, 2> = tensor.clone().powf_scalar(2.0).sum_dim(1);

Run this code on WebGPU in Chrome (Firefox Nightly seems fine with it, though I've had a host of other issues on Firefox, so it generally seems less complete).

The output looks like:

Writable storage buffer binding aliasing found between [BindGroup (unlabeled)] set at bind group index 0, binding index 0, and [BindGroup (unlabeled)] set at bind group index 0, binding index 1, with overlapping ranges (offset: 0, size: 8) and (offset: 0, size: 8) in [Buffer (unlabeled)].
 - While encoding [ComputePassEncoder (unlabeled)].DispatchWorkgroups(1, 1, 1).

127.0.0.1/:1 [Invalid CommandBuffer "Command Encoder" from CommandEncoder "Command Encoder"] is invalid.
 - While calling [Queue].Submit([[Invalid CommandBuffer "Command Encoder" from CommandEncoder "Command Encoder"]])

Expected behavior

Don't crash :) How to achieve that seems trickier. You'd need different WGSL versions of the kernel depending on whether inputs/outputs are the same or not. CubeCL might be able to handle that, but I'm not entirely sure what's best here!
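
To make the "same or not" condition concrete, here is a minimal sketch of the kind of check that could run before a dispatch. It mirrors the error message above (same buffer, overlapping ranges, at least one writable binding) as I understand the rule. The `Binding` struct and both functions are hypothetical names for illustration, not part of Burn, CubeCL, or wgpu.

```rust
/// Hypothetical description of one storage-buffer binding in a dispatch.
/// None of these names come from Burn, CubeCL, or wgpu.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Binding {
    /// Identifier of the underlying GPU buffer (illustrative only).
    pub buffer_id: u64,
    /// Byte offset of the bound range.
    pub offset: u64,
    /// Byte size of the bound range.
    pub size: u64,
    /// Whether the shader declares this binding as writable storage.
    pub writable: bool,
}

/// Returns true when two bindings cover overlapping bytes of the same
/// buffer and at least one of them is writable.
pub fn aliases_writably(a: &Binding, b: &Binding) -> bool {
    let same_buffer = a.buffer_id == b.buffer_id;
    // Half-open ranges [offset, offset + size) overlap when each one
    // starts before the other ends.
    let overlap = a.offset < b.offset + b.size && b.offset < a.offset + a.size;
    same_buffer && overlap && (a.writable || b.writable)
}

/// Finds the first pair of binding indices that would trip the check.
pub fn find_writable_alias(bindings: &[Binding]) -> Option<(usize, usize)> {
    for i in 0..bindings.len() {
        for j in (i + 1)..bindings.len() {
            if aliases_writably(&bindings[i], &bindings[j]) {
                return Some((i, j));
            }
        }
    }
    None
}
```

With two writable bindings on the same buffer at `(offset: 0, size: 8)`, as in the log above, `find_writable_alias` would report the pair `(0, 1)`. A launcher could use that result to pick a kernel variant that binds the buffer only once.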

ArthurBrussee avatar Jun 25 '24 22:06 ArthurBrussee

CC @louisfd @nathanielsimard

antimora avatar Jun 28 '24 16:06 antimora

Not sure if there is an option to disable that check. If not, I guess we will have to compile more kernels automatically based on handle IDs.
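
One way to picture "more kernels based on handle IDs" is a cache keyed on the aliasing pattern of a launch rather than on the IDs themselves. The sketch below is an assumption about how that could look, not how CubeCL does it: `alias_pattern` and `KernelCache` are hypothetical names, and compilation is stood in for by building a label string.

```rust
use std::collections::HashMap;

/// Maps each binding to the index of the first binding that shares its
/// handle ID, so launches with the same aliasing shape get the same key.
pub fn alias_pattern(handle_ids: &[u64]) -> Vec<usize> {
    let mut first_seen: HashMap<u64, usize> = HashMap::new();
    handle_ids
        .iter()
        .enumerate()
        .map(|(i, id)| *first_seen.entry(*id).or_insert(i))
        .collect()
}

/// Hypothetical cache: one "compiled" variant per (kernel name, alias pattern).
/// A String label stands in for a real compiled pipeline.
pub struct KernelCache {
    variants: HashMap<(String, Vec<usize>), String>,
}

impl KernelCache {
    pub fn new() -> Self {
        Self { variants: HashMap::new() }
    }

    /// Returns the variant for this launch, creating it on first use.
    pub fn get_or_compile(&mut self, kernel: &str, handle_ids: &[u64]) -> &str {
        let pattern = alias_pattern(handle_ids);
        self.variants
            .entry((kernel.to_string(), pattern.clone()))
            // Placeholder for real shader generation and compilation.
            .or_insert_with(|| format!("{kernel}{pattern:?}"))
            .as_str()
    }

    /// Number of distinct variants created so far.
    pub fn variant_count(&self) -> usize {
        self.variants.len()
    }
}
```

Keying on the pattern instead of the raw IDs means `mul(a, a)` and `mul(b, b)` share one variant, while `mul(a, b)` gets another, so the number of compiled kernels grows with the number of aliasing shapes rather than with the number of tensors.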

nathanielsimard avatar Jul 01 '24 14:07 nathanielsimard

@ArthurBrussee checking back. Do we still have this issue?

antimora avatar May 06 '25 18:05 antimora

No, I think this is fixed!

ArthurBrussee avatar May 06 '25 19:05 ArthurBrussee