xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
[mlir][hlo][sparse] override the inferred dense return type from xla::ConcatInDim when a sparse return value is requested.
Fold MapOp to one of its ContractionOp operands. Given a MapOp that adds a ContractionOp to some other op A, fold the MapOp by making A the init operand of...
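A minimal sketch of the algebraic identity behind this fold, written as plain C++ loops rather than the actual MLIR ops (`Mat`, `Contract`, and `Add` are illustrative stand-ins, not XLA code): a contraction with a zero init followed by an elementwise add of A gives the same result as the contraction with A as its init operand.

```cpp
#include <array>
#include <cassert>

// Illustrative stand-in for a 2x2 matmul contraction with an explicit
// init operand: out[i][j] = init[i][j] + sum_k lhs[i][k] * rhs[k][j].
using Mat = std::array<std::array<int, 2>, 2>;

Mat Contract(const Mat& lhs, const Mat& rhs, const Mat& init) {
  Mat out = init;
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 2; ++k) out[i][j] += lhs[i][k] * rhs[k][j];
  return out;
}

Mat Add(const Mat& x, const Mat& y) {
  Mat out{};
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) out[i][j] = x[i][j] + y[i][j];
  return out;
}

int main() {
  Mat lhs = {{{1, 2}, {3, 4}}}, rhs = {{{5, 6}, {7, 8}}};
  Mat a = {{{9, 9}, {9, 9}}}, zero{};
  // map(add, contraction(lhs, rhs, zero), a) folds to
  // contraction(lhs, rhs, /*init=*/a).
  assert(Add(Contract(lhs, rhs, zero), a) == Contract(lhs, rhs, a));
}
```

Folding the add into the init operand saves an intermediate tensor and a separate elementwise traversal.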
This relates to the JAX issue [#14655](https://github.com/google/jax/issues/14655): copying in various details from that thread below. I've got a use case where I'd like to store the nonzero entries of a...
[XLA:GPU] Remove unnecessary upcasts and downcasts just for an add
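A hedged C++ analogy for the pattern being removed, with float standing in for the narrow HLO type and double for the upcast type: widening both operands, adding, and narrowing back yields the same correctly rounded result as adding at the original precision (for a single add, 53 mantissa bits ≥ 2·24 + 2, so no double rounding occurs under IEEE arithmetic), which is why the converts can be dropped.

```cpp
#include <cassert>

// Stand-in for the HLO pattern: downcast(upcast(a) + upcast(b)).
// Assumes IEEE-754 arithmetic with no excess intermediate precision.
float UpcastAddDowncast(float a, float b) {
  return static_cast<float>(static_cast<double>(a) + static_cast<double>(b));
}

int main() {
  float a = 0.1f, b = 0.2f;
  // Equivalent to adding directly at float precision, so the casts are
  // unnecessary.
  assert(UpcastAddDowncast(a, b) == a + b);
}
```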
[XLA:GPU] Custom kernel for small sum reductions that is intended to run faster than NCCL.
Internal CI testing.
Scalarize ScatterOp during tiling if tile_size=1
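A rough C++ sketch of what scalarization buys here (function and variable names are hypothetical, not the actual tiling code): when tile_size is 1, the per-tile inner loop of a scatter update collapses to a single indexed read-modify-write.

```cpp
#include <cstdint>
#include <vector>

// Generic tiled scatter-add: for each index, apply a tile_size-wide slice
// of `updates` to `out`.
void ScatterAddTiled(std::vector<float>& out,
                     const std::vector<int64_t>& indices,
                     const std::vector<float>& updates, int64_t tile_size) {
  for (size_t i = 0; i < indices.size(); ++i)
    for (int64_t t = 0; t < tile_size; ++t)
      out[indices[i] + t] += updates[i * tile_size + t];
}

// Scalarized form for tile_size == 1: the tile loop disappears and each
// update becomes a single read-modify-write.
void ScatterAddScalar(std::vector<float>& out,
                      const std::vector<int64_t>& indices,
                      const std::vector<float>& updates) {
  for (size_t i = 0; i < indices.size(); ++i) out[indices[i]] += updates[i];
}

int main() {
  std::vector<float> out(4, 0.f);
  ScatterAddScalar(out, {2, 0, 2}, {1.f, 2.f, 3.f});  // out = {2, 0, 4, 0}
}
```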
[XLA:SPMD] Factor out utility functions from SPMD to be used elsewhere.
Use std::optional instead of llvm::Optional. Note that llvm::Optional is just an alias for std::optional these days and has since been deprecated upstream in favor of std::optional.
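Since llvm::Optional is now only an alias, the migration is mechanical: swap the spelling and the header. A minimal sketch (FindWidth is a hypothetical function, not from the XLA codebase):

```cpp
#include <optional>

// Before: llvm::Optional<int> FindWidth(bool known);  // llvm/ADT/Optional.h
// After: the standard type, identical semantics.
std::optional<int> FindWidth(bool known) {
  if (!known) return std::nullopt;  // llvm::None is likewise deprecated
  return 42;
}

int main() { return FindWidth(true).value_or(0) == 42 ? 0 : 1; }
```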
With the latest implementation of Latency Hiding Scheduling, we observe that most weight gradient all-reduce latency is still exposed [(see slides 6 and 7 here)](https://docs.google.com/presentation/d/1s2B4DPuhOVQbJ4SAZA7XWBKL5ST-Dfcn/edit#slide=id.g1895a52e93e_0_0). Here is a brief...