xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
📝 Summary of Changes This change adds PJRT_Triton_Extension support for ROCm as the counterpart of the existing CUDA support. On ROCm, Pallas Triton calls are lowered directly to HSACO rather than to PTX...
Reverts f4d835b4b9b734953bd9eac84b8aa5358f9f6ffa
📝 Summary of Changes Properly support asan/tsan builds with RBE by providing the lists through the run_under script 🎯 Justification The asan and tsan configs were missing the run_under wrapper, so...
[XLA:GPU] Make flop_per_ns_per_fpu a double in CalculateEffectiveFlopsPerNs Otherwise we were undercounting effective flops. Example for H100 at full occupancy: fpu_count = n_active_core * n_active_fpus_per_core; // 132 * 128 = 16896...
PR #30855: [ROCM] CommandBuffer support for CollectivePermute op Imported from GitHub PR https://github.com/openxla/xla/pull/30855 📝 Summary of Changes - added CommandBuffer support for CollectivePermute op 🎯 Justification These ops were missing...
[XLA:GPU] Add more informative error messages to CHECKs in GpuPerformanceModel.
📝 Summary of Changes - added CommandBuffer support for the CollectivePermute op 🎯 Justification These ops were missing for whatever reason; this results in graph fragmentation, especially for large models. Hence...
Use `absl::StrAppend` for string concatenation. Replace `+= absl::StrCat` with `absl::StrAppend` for more efficient string appending.
📝 Summary of Changes After this change to `GpuComputeCapability` (https://github.com/openxla/xla/commit/11b6a3db362f30b79d385f24523d184592132f11), `RocmExecutorTest::CreateDeviceDescription` had a build break. 🚀 Kind of Contribution 🐛 Bug Fix
[XLA:GPU] Use IsTritonSupportedDataType from the new support checks