xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Run mobilenet_v2 HLO benchmark on CPU in A/B diff script.
Add PJRT TPU AOT device_attributes support to PjRtDeviceTopology. Also adds hidden APIs that allow compiling AOT.
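The hidden topology-based AOT APIs mentioned above are internal; as a hedged, user-level illustration of ahead-of-time compilation against XLA, a minimal sketch using the public JAX lower/compile path (the function and shapes are made up):

```python
# A minimal sketch of AOT compilation via JAX's public lower/compile API.
# This is an illustration only, not the hidden PJRT topology APIs referenced
# in the change above; the function and shapes are made up.
import jax
import jax.numpy as jnp

def f(x):
    return jnp.tanh(x) * 2.0

# Trace and lower against an abstract argument (no concrete data),
# then compile ahead of time.
abstract_arg = jax.ShapeDtypeStruct((8, 128), jnp.float32)
compiled = jax.jit(f).lower(abstract_arg).compile()
print(compiled(jnp.ones((8, 128), jnp.float32)))
```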
[MemorySpaceAssignment] Add a flag to tune async copy start locations for operands greater than a given size.
Remove Execute and ExecuteWithToken from py_executable.cc
[XLA:GPU] Remove (now unnecessary) Triton-specific kernel reuse. Now that we have general fusion kernel reuse, the Triton-specific reuse is no longer needed. This change should not have any runtime...
Real fix for b/273369126 (also resolves b/273583026). Removes the `YieldUnsafeUnsortedEvents()` API, removes `MaybeDropEventsForTraceViewer`, and distributes this logic between `TraceContainer::EventSlice` and the JSON serializer. `TraceContainer` is now a safe ARC...
[xla:runtime] Add support for passing async values to the runtime executable. This change adds support for passing async values to a runtime executable, e.g., ``` async.func @test(%arg0: !async.value, %arg1: i32) ->...
Make *_hdrs targets depend only on headers. This prevents API users from accidentally compiling in the implementations.
Move //third_party/tensorflow/compiler/xla/service:hlo_{cost_analysis, creation_utils, query} and tests to //third_party/tensorflow/compiler/xla/hlo/utils and update all users.
Improve handling of dynamic shapes in JAX native serialization.
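As a rough sketch of how dynamic (shape-polymorphic) dimensions appear in native serialization from the user side, assuming the installed JAX version exposes jax2tf's `native_serialization` and `polymorphic_shapes` options (the function and dimensions below are made up):

```python
# A hedged sketch: serialize a JAX function with a symbolic batch dimension "b"
# via jax2tf native serialization, which lowers to StableHLO. Assumes a JAX
# version that exposes `native_serialization` and `polymorphic_shapes`.
import jax.numpy as jnp
import tensorflow as tf
from jax.experimental import jax2tf

def f(x):
    return jnp.sum(x * x, axis=-1)

f_tf = tf.function(
    jax2tf.convert(f, native_serialization=True, polymorphic_shapes=["(b, 16)"]),
    autograph=False,
    input_signature=[tf.TensorSpec([None, 16], tf.float32)],
)
# The same serialized function handles different batch sizes.
print(f_tf(tf.ones([4, 16], tf.float32)))
print(f_tf(tf.ones([7, 16], tf.float32)))
```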