xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

Results: 653 xla issues

Update autotuner to filter out "Cublas_fission" backends. This fixes the xla_gpu_cublas_fallback flag behavior.

Refactor: Use `std::align_val_t` for aligned allocation functions. This change updates `AlignedMalloc`, `AlignedSizedFree`, and `AlignedAllocator` to use `std::align_val_t` for specifying alignment, aligning with standard C++ practices for overaligned allocation. Deprecated inline...
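As a rough illustration of the entry above, overaligned allocation through `std::align_val_t` can be sketched with the C++17 aligned forms of the global `operator new` and `operator delete`. The helper signatures below are assumptions made for this sketch and are not taken from the XLA source; `AlignedFree` and `IsAligned` are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper (signature assumed): allocate `size` bytes aligned to
// `alignment`, using the C++17 aligned form of the global operator new.
inline void* AlignedMalloc(std::size_t size, std::align_val_t alignment) {
  return ::operator new(size, alignment);
}

// Hypothetical helper: release memory obtained from AlignedMalloc.
// The alignment passed here must match the one used at allocation time.
inline void AlignedFree(void* ptr, std::align_val_t alignment) {
  ::operator delete(ptr, alignment);
}

// Hypothetical helper: check that a pointer is a multiple of `alignment`.
inline bool IsAligned(const void* ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

A caller would write, for example, `void* p = AlignedMalloc(100, std::align_val_t{64});` and later `AlignedFree(p, std::align_val_t{64});`. Making the alignment a distinct type keeps it from being confused with the size argument at call sites.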

Migrate memory_space_assignment_test_base to PjRt.

Include compilation environment and debug options in split comp. fingerprints.

[XLA:CollectivePipeliner] Fix two issues: 1) Accept transpose as a formatting op in ForwardSink. 2) Do not stop when a large collective was sunk in the previous iteration. Instead, delay sinking...

[XLA:CPU] TargetMachine contains all target information. 1) Makes sure features detected in PjRT are used by the CPU compiler. 2) Ensures target machine is initialized with requested features. This is...

[XLA] Rename TargetConfig to GpuTargetConfig and add CpuTargetConfig to CompilerOptions. We need a way to pass target information to the CPU compiler, and TargetConfig seems to fit that purpose.

📝 Summary of Changes: All-gathers can only run on the major-most physical dimension (concatenating buffers from ranks). When an all-gather on a logical dimension index > 0 is requested,...