xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
[XLA:CPU] Propagate HloXlaRuntimePipelineOptions for fusion outlining In case of fusion outlining, enable experimental deallocation and disable sparse bufferization.
Always scalarize thlo.reverse.
[XLA:CPU] Add `xla_cpu_enable_mlir_fusion_outlining` flag Enables fusion outlining into functions. This is to improve compile time.
Also build all xla targets
#tf-data-service Improve error handling for SnapshotManager. If the snapshot manager receives an error from a worker: 1. It writes a StatusProto to an ERROR file. The error status can be...
[XLA] Add int4 types to MHLO translate.
[PJRT C API] Bump up the xla_client version as the signature of make_c_api_client was changed in a previous change.
[PJRT:C] Implement C API version of xla::PjRtChunk.
Add set_to_apply_wo_fusioncheck function to skip fusion check when assigning a to_apply computation.
@sherhut @d0k @jreiffers It would be interesting to benchmark XLA:CPU Next on ARM. I am starting this issue to track the progress and also to share information about the code...