xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
[pjrt] Add initial APIs to create and destroy PJRT_ExecuteContext. Registering user-defined types with FFI and passing user data to FFI handlers via the execute context are coming in follow-up PRs.
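A minimal sketch of how such a create/destroy pair would typically be driven through the PJRT C API function table; the argument-struct and field names below (PJRT_ExecuteContext_Create_Args, extension_start, the *_STRUCT_SIZE constants) follow the API's usual conventions and should be read as assumptions, not as the exact interface added by this change.

// Sketch only: identifiers below are assumed from PJRT C API conventions.
#include "xla/pjrt/c/pjrt_c_api.h"

// Creates an execute context through a loaded PJRT_Api table, then destroys it.
// Returns the first error encountered; the caller frees it with PJRT_Error_Destroy.
PJRT_Error* RoundTripExecuteContext(const PJRT_Api* api) {
  PJRT_ExecuteContext_Create_Args create_args;
  create_args.struct_size = PJRT_ExecuteContext_Create_Args_STRUCT_SIZE;
  create_args.extension_start = nullptr;
  create_args.context = nullptr;  // out-parameter, filled on success
  if (PJRT_Error* error = api->PJRT_ExecuteContext_Create(&create_args)) {
    return error;
  }

  PJRT_ExecuteContext_Destroy_Args destroy_args;
  destroy_args.struct_size = PJRT_ExecuteContext_Destroy_Args_STRUCT_SIZE;
  destroy_args.extension_start = nullptr;
  destroy_args.context = create_args.context;
  return api->PJRT_ExecuteContext_Destroy(&destroy_args);
}

As with the rest of the C API, every call takes a sized args struct so the interface can grow without breaking older plugins.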
[XLA:CPU] Remove unnecessary include of absl/log/check.h. This resolves naming conflicts with tsl and absl.
[Multi-host GPU] Add a utility function to build GpuTopologyProto from GlobalTopologyProto.
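A hedged sketch of what such a utility could look like; the proto field names used here (nodes, devices, global_device_id, device_ids) are assumptions for illustration and may not match the actual message definitions.

// Sketch only: the proto field names below are assumed for illustration.
#include "xla/pjrt/distributed/protocol.pb.h"
#include "xla/pjrt/gpu/gpu_topology.pb.h"

namespace xla {

// Flattens the per-host device lists of a GlobalTopologyProto into a single
// GpuTopologyProto describing the multi-host GPU topology.
GpuTopologyProto BuildGpuTopology(const GlobalTopologyProto& global_topology) {
  GpuTopologyProto gpu_topology;
  for (const auto& node : global_topology.nodes()) {            // one entry per host
    for (const auto& device : node.devices()) {                 // devices on that host
      gpu_topology.add_device_ids(device.global_device_id());   // assumed field name
    }
  }
  return gpu_topology;
}

}  // namespace xla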
Make StreamExecutorGpuTopologyDescription compile without error
[pjrt] Add an API to add user data to FFI context via PJRT_ExecuteContext
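A sketch of what attaching user data through the execute context could look like on the C API side; every PJRT_FFI_* identifier below (the extension header, PJRT_FFI_UserData_Add_Args, the type_id/data/deleter fields, the user_data_add member) is a hypothetical name inferred from this entry, not a verified signature.

// Sketch only: all PJRT_FFI_* names here are hypothetical.
#include <cstdint>

#include "xla/pjrt/c/pjrt_c_api.h"
#include "xla/pjrt/c/pjrt_c_api_ffi_extension.h"  // assumed header

struct MyState {  // arbitrary user payload made visible to FFI handlers
  int64_t step = 0;
};

// Attaches a MyState instance to an execute context so that FFI handlers
// invoked during that execution can look it up by type id.
PJRT_Error* AttachUserData(const PJRT_FFI_Extension* ffi,
                           PJRT_ExecuteContext* context,
                           MyState* state, int64_t type_id) {
  PJRT_FFI_UserData_Add_Args args;
  args.struct_size = PJRT_FFI_UserData_Add_Args_STRUCT_SIZE;
  args.extension_start = nullptr;
  args.context = context;
  args.user_data = {/*type_id=*/type_id, /*data=*/state,
                    /*deleter=*/nullptr};  // context does not own the state here
  return ffi->user_data_add(&args);
}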
For the current column reduction codegen, the SM core active ratio is low if the last kept dimension is small, as can be seen in the HLO below: fused_reduce { param_1.15 = bf16[1,2048]{1,0} parameter(1) bitcast.86.8...
Test CL, ignore
Hi OpenXLA community, I have a question about the code that generates sharding strategies and computes resharding costs for the HLO `reshape` op. In the code, the `reshape` sharding strategy's `output_spec` is generated...
Automated Code Change
Reverts changelist 632818524