xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Add some additional args for select_and_scatter_test for certain backends. Also clean up includes for the select_and_scatter_test.
[XLA:MSA] Convert synchronous slices to async.
Make presubmit happy for DO_NOT_SUBMIT CL.
Remove build_cuda_plugin_from_source in xla CI.
Ensure BFloat16Propagation respects instructions that do not support mixed precision.
This CL adds a feature in the host instrumentation where tuple outputs are handled correctly. The instrumentation handler was not storing the tuple literals in a way that was compatible...
A minor update to HloEvaluatorWithSubstitution: the type is changed from Literal to LiteralBase so that BorrowingLiteral outputs can be passed to the function.
[XLA:GPU] Plug xla_gpu.loop into EmitThreadLoop.
[IFRT] Introduce Client::AllocateDevices() and DeviceAllocation. `xla::ifrt::Client::AllocateDevices()` is a new API that processes a user request for getting an ordered set of devices that satisfies constraints specified in the request. It...
This CL adds the missing host memory deallocation according to the pointer's host memory space allocated by GpuExecutor::Allocate().