Scott Todd
The source for `configured_module_prefill_bs4$async_dispatch_1.mlir` is:

```mlir
hal.executable public @prefill_bs4$async_dispatch_1 {
  hal.executable.variant public @vulkan_spirv_fb target() {
    hal.executable.export public @prefill_bs4$async_dispatch_1_generic_4xDx3200_i64xf32 ordinal(0) layout(#hal.pipeline.layout) attributes {hal.interface.bindings = [#hal.interface.binding, #hal.interface.binding, #hal.interface.binding]} {
    ^bb0(%arg0: !hal.device, %arg1:...
```
The `configured_module_prefill_bs4$async_dispatch_1.mlir` issue (using too much shared memory) may go away with https://github.com/llvm/torch-mlir/pull/3277.

I still see the `spirv.IAdd` issue with `configured_module_prefill_bs4$async_dispatch_0.mlir`:

```mlir
hal.executable public @prefill_bs4$async_dispatch_0 {
  hal.executable.variant public @vulkan_spirv_fb target()...
```
Copying from https://github.com/nod-ai/sharktank/issues/22#issuecomment-2099983855:

> For spirv-vulkan backend here's the minimal repro
>
> ```mlir
> func.func @torch_add(%arg0: !torch.vtensor, %arg1: !torch.vtensor) -> !torch.vtensor {
>   %int1 = torch.constant.int 1
>   %2...
> ```
> ### Version information
> Installed via pip and should be the newest release
>
> ```
> IREE (https://iree.dev):
>   IREE compiler version 20240410.859 @ b4273a4bfc66ba6dd8f62f6483d74d42a7b936f1
>   LLVM version...
> ```
> @ScottTodd My PR fixed this error. Maybe you could try #17137 to see if it is also fixed 🙂

Ah, thanks for the note. The test failures on that...
We can also include collectives (multi-gpu) tests on the AMDGPU machines. See [this discord discussion](https://discord.com/channels/689900678990135345/689906000043573354/1235629696511639552). The mi250 and w7900 runners both have 4 GPUs. Enabling multi-gpu testing once we have...
This is sort of done and stable. Still needs a dedicated owner though. I'm auditing some of our test suites now and finding areas that are only tested on CPU/Vulkan/CUDA...
> For some reason, byo_llvm is able to build and link successfully but then segfaults on some of the tests. These tests aren't related to PDLL at all either.

Got...
O_O I found a workaround for the failed build: set `-DIREE_ENABLE_LLD=ON` (to link to lld instead of the system linker)
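For reference, a configure step using that option might look like the sketch below. The generator, the source directory (`.`), and the build directory (`../iree-build`) are assumptions for illustration; only `-DIREE_ENABLE_LLD=ON` comes from the workaround above.

```shell
# Hypothetical layout: run from the IREE source checkout, build into ../iree-build.
cmake -G Ninja -B ../iree-build -S . \
  -DIREE_ENABLE_LLD=ON  # link with lld instead of the system linker

# Rebuild with the new linker setting.
cmake --build ../iree-build
```

If the build directory was configured earlier without the flag, re-running the configure step with it should be enough, though a clean build directory avoids stale linker settings.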
Proposed another solution: https://github.com/iree-org/iree/pull/17493