David Alsh

Results 141 comments of David Alsh

@Matthew-Jenkins, appears to make no difference unfortunately:

With (baseline):

```
export PYTORCH_TUNABLEOP_ENABLED=1
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
export TORCH_COMMAND="--pre torch torchvision torchaudio pytorch-triton-rocm --index-url https://download.pytorch.org/whl/nightly/rocm6.4"
python3.10 ./main.py --use-pytorch-cross-attention
```

|Model|Steps|Resolution|Speed|Time|Notes|
|-|-|-|-|-|-|
|SDXL|20|1024x1024|1.49it/s|34.56s||
|SDXL|20|1024x1024|1.5it/s|27.76s|Manual...

Thanks for the tips

> Try `MIOPEN_FIND_MODE=2 HSA_OVERRIDE_GFX_VERSION=11.0.0`
> if that doesn't work then try `MIOPEN_FIND_MODE=2 HSA_OVERRIDE_GFX_VERSION=10.3.0`

Just tried both of these, unfortunately no luck there.

> Until pytorch updates...

Hey @kasper93, regarding this:

> and pytroch install from https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/ it should work "reasonable"

Are there docs on how to install PyTorch from here? I have no idea what `pip...

Not directly related but I just tried ComfyUI under Windows + latest HIP + Zluda on my 9070xt and performance is about the same as ROCm under Linux. Generating an...

> So this is also a Windows issue and not necessarily Rocm related?!

This occurs both on Windows and Linux. [This thread](https://github.com/ROCm/ROCm/issues/5040) has some interesting insights. It also looks like...

I have just tried installing ROCm 7 beta on Fedora 42 via the RHEL repos and retried generating a 1024x1024 image with SDXL on ComfyUI. Relatively easy to install: https://www.youtube.com/watch?v=7qDlHpeTmC0...

I'm also interested in Vulkan benchmarks. I've seen some LLMs perform incredibly well on the 9070xt under Vulkan, which gives me hope. I've been trying to compile PyTorch to...

What makes matters worse is that my experiments with ROCm 7 don't seem to improve performance on my 9070xt. So much potential in these cards to kick ass but it doesn't...

I just tested official ROCm 7.0 with the ROCm fork of Pytorch

- Ubuntu 24.04
- Python 3.12.11
- ROCm 7.0 ([Installed from AMD](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html#rocm-installation))
- Pytorch 2.8.0 ([Installed from AMD](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html#rocm-installation))...

20 steps, same workflow/config as the other tests. I tried to get the MiGraphX node working but had no luck unfortunately. 90% of the time is spent in VAE decoding,...