xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

Results 553 xla issues

Refactor OSS Trace proto into Trace metadata and a TraceContainer. This CL introduces an OSS TraceContainer type that is very similar to the internal TraceEventsContainer. End goal is to either...

Enroll EagerOperations to DTensor caching. This works around a leak problem when the same Eager operation is executed multiple times. A more proper fix is to register a notifier_fn when...

[LatencyHidingScheduler] Add ProfileGuidedLatencyEstimator.

Add `-windows_excluded` to TF build/test tag filters. Currently, `no_windows` is used to exclude a test from running in the Windows environment. However, it is difficult to distinguish between temporary and...

[LatencyHidingScheduler] Add an option to place host send and send-done as early in the schedule as possible. Controlled by the enable_send_recv_post_process_scheduling scheduler config option.

Merge C++ and Mesh implementation of most Mesh methods. C++ becomes the source of truth for Mesh. Changes are in layout.py, tensor_layout.[cc|h], and the pywrap_dtensor_device.cc file. Many attribute methods are...

[LatencyHidingScheduler] Add optional pass that moves host send-done to just before the following send.

Better flag alignment for JIT/AOT paths. - Hook up the flag for the new deallocator on the JIT path. - Hook up the flag for dumping snapshots on the AOT...

PR #59936: [NVIDIA XLA] Disable TF32 evaluation for SelfAdjointEigTest cases. Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/59936. When run on Ampere and newer GPUs, some of the self-adjoint eigenvalue test cases...