xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

Results 653 xla issues

To enable ReLU epilogue fusion for CublasLt matmul in training, two pairs of epilogues are added: (RELU_AUX, DRELU) and (BIAS_RELU_AUX, DRELU_BGRAD). The RELU_AUX (or BIAS_RELU_AUX) epilogue for the forward matmul outputs...
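
The pairing of a forward and a backward epilogue can be illustrated with a small reference sketch in plain Python. It is not cuBLASLt or XLA code, and every function name in it is hypothetical. It assumes, as cuBLASLt documents for its AUX epilogues, that the forward epilogue emits a ReLU mask alongside the activation, and that the backward epilogue applies that mask to the output of the gradient matmul. Treating the bias gradient as the column sum of the masked gradient is also an assumption of this sketch.

```python
# Illustrative reference semantics only; not cuBLASLt or XLA code.
# All function names here are hypothetical.

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def forward_bias_relu_aux(x, w, bias):
    """Forward matmul with fused bias + ReLU; also returns the auxiliary mask."""
    z = [[v + bias[j] for j, v in enumerate(row)] for row in matmul(x, w)]
    out = [[max(v, 0.0) for v in row] for row in z]
    mask = [[v > 0.0 for v in row] for row in z]  # assumed auxiliary output
    return out, mask

def backward_drelu_bgrad(grad_out, w_next, mask):
    """Backward matmul with fused ReLU gradient and bias gradient.

    grad_out: gradient of the loss w.r.t. the next layer's output.
    w_next:   the next layer's weights.
    mask:     auxiliary output saved by the forward epilogue.
    """
    grad_act = matmul(grad_out, transpose(w_next))
    # ReLU gradient: pass the gradient only where the forward input was positive.
    grad_pre = [[g if m else 0.0 for g, m in zip(g_row, m_row)]
                for g_row, m_row in zip(grad_act, mask)]
    # Bias gradient: reduce the masked gradient over the batch dimension.
    grad_bias = [sum(col) for col in zip(*grad_pre)]
    return grad_pre, grad_bias

x = [[1.0, -2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
out, mask = forward_bias_relu_aux(x, w, [0.5, 0.5])
grad_pre, grad_bias = backward_drelu_bgrad([[1.0]], [[2.0], [3.0]], mask)
```

Without the saved mask, the backward pass would need the pre-activation values themselves, or a separate elementwise kernel, to compute the ReLU gradient. Passing the mask between the two epilogues is what makes the fusion possible in this sketch.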

kokoro:force-run

Sink broadcast(constant) into while body. It is possible to sink the initialization broadcast into the while body and replace it with a free allocate-buffer custom-call if the entire shape of...

[xla:cpu] Don't forget to release SimpleOrcJit resources after done with compiling

I have found the pattern below in some models with poor SPMD partitioning:
```
all-gather.1 = all-gather(x)
dot.1 = dot(all-gather.1, y)
dynamic-slice.1 = dynamic-slice(all-gather.1) // can be cancelled
```
...
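
A minimal simulation in plain Python of why such a dynamic-slice can be cancelled. The helper names are hypothetical, and this is not how XLA represents or rewrites HLO. The sketch assumes that each device slices the gathered result at its own offset (`device_id * shard_size`) with the shard's size, a condition the truncated text above does not state.

```python
# Illustrative simulation only; helper names are hypothetical and this is
# not how XLA represents or rewrites HLO.

def all_gather(shards):
    """Every device receives the concatenation of all shards, in device order."""
    full = [v for shard in shards for v in shard]
    return [list(full) for _ in shards]

def dynamic_slice(values, start, size):
    """Contiguous 1-D slice of length `size` beginning at `start`."""
    return values[start:start + size]

def slice_after_gather(shards):
    """Unoptimized pattern: gather everything, then slice the local part back out."""
    shard_size = len(shards[0])
    gathered = all_gather(shards)
    # Assumption: device d slices at offset d * shard_size.
    return [dynamic_slice(gathered[d], d * shard_size, shard_size)
            for d in range(len(shards))]

shards = [[1, 2], [3, 4], [5, 6]]
result = slice_after_gather(shards)
# Each device ends up with exactly the shard it started with.
assert result == shards
```

Under that assumption the slice returns the original operand `x` on every device, so `dynamic-slice.1` could read `x` directly. `all-gather.1` itself still has to stay, because `dot.1` consumes it.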