xla issues

refactor OSS Trace proto into Trace metadata and a TraceContainer.

This CL introduces an OSS TraceContainer type that is very similar to the internal TraceEventsContainer. The end goal is to either...

copybara-service[bot]

Enroll EagerOperations to DTensor caching.

This works around a leak that occurs when the same Eager operation is executed multiple times. A more proper fix is to register a notifier_fn when...

copybara-service[bot]

[LatencyHidingScheduler] Add ProfileGuidedLatencyEstimator.

copybara-service[bot]

Add `-windows_excluded` to TF build/test tag filters

Currently, `no_windows` is used to exclude a test from running in the Windows environment. However, it is difficult to distinguish between temporary and...

copybara-service[bot]

[LatencyHidingScheduler] Add an option to place host send and send-done as early in the schedule as possible.

Controlled by the `enable_send_recv_post_process_scheduling` scheduler config option.

copybara-service[bot]

Merge C++ and Mesh implementation of most Mesh methods.

C++ becomes the source of truth for Mesh. Changes are in layout.py, tensor_layout.[cc|h], and the pywrap_dtensor_device.cc file. Many attribute methods are...

copybara-service[bot]

[LatencyHidingScheduler] Add optional pass that moves host send-done to just before the following send.

copybara-service[bot]

DO NOT SUBMIT, testing automated actions

tpopp

Better flag alignment for JIT/AOT paths.

- Hook up the flag for the new deallocator on the JIT path.
- Hook up the flag for dumping snapshots on the AOT...

copybara-service[bot]

PR #59936: [NVIDIA XLA] Disable TF32 evaluation for SelfAdjointEigTest cases.

Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/59936 When run on Ampere and newer GPUs, some of the self-adjoint eigenvalue test cases...

copybara-service[bot]
