Paul Fultz II

Results: 130 issues by Paul Fultz II

For type-erased classes, this will now check that the class implements the methods it is asking for. Currently it only checks the method names and correct parameters; it is still missing...
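As a rough illustration of what "checking that a class implements a method" can look like for a type-erased wrapper, here is a minimal C++17 sketch. It is not the check described in the issue: the names `any_named`, `has_name`, `relu_like`, and `no_name` are hypothetical, and the sketch only shows a compile-time constraint on a single `name()` method.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Detection idiom: true when `const T` has a member name() whose result
// can be converted to std::string. (Hypothetical trait for illustration.)
template <class T, class = void>
struct has_name : std::false_type
{
};

template <class T>
struct has_name<T, std::void_t<decltype(std::string(std::declval<const T&>().name()))>>
    : std::true_type
{
};

// Hypothetical type-erased wrapper that accepts any type providing name().
class any_named
{
    struct concept_t
    {
        virtual ~concept_t()             = default;
        virtual std::string name() const = 0;
    };

    template <class T>
    struct model_t : concept_t
    {
        T value;
        explicit model_t(T v) : value(std::move(v)) {}
        std::string name() const override { return value.name(); }
    };

    std::shared_ptr<const concept_t> self;

    public:
    // The constructor is only available when T implements name(), so a
    // missing method is rejected at the call site rather than deep inside
    // model_t.
    template <class T, class = std::enable_if_t<has_name<T>::value>>
    any_named(T x) : self(std::make_shared<model_t<T>>(std::move(x)))
    {
    }

    std::string name() const { return self->name(); }
};

// Example types: one satisfies the interface, one does not.
struct relu_like
{
    std::string name() const { return "relu"; }
};

struct no_name
{
};
```

With these definitions, `any_named{relu_like{}}` compiles and forwards `name()`, while `std::is_constructible<any_named, no_name>::value` is `false`.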

Currently, in migraphx, we only simplify shape transformations of the same operator, such as repeated transposes or reshapes. However, this will simplify across reshape/transpose/broadcast. It produces a much simpler set...
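To make the same-operator case concrete, the sketch below shows how two consecutive transposes collapse into a single permutation. It is a standalone illustration, not migraphx code: the helper names `apply_transpose`, `compose_transposes`, and `is_identity` are hypothetical, and it assumes the convention that output dimension `i` of a transpose takes input dimension `perm[i]`.

```cpp
#include <cstddef>
#include <vector>

using dims_t = std::vector<std::size_t>;

// Apply a transpose to a list of dimensions.
// Assumed convention: out[i] = dims[perm[i]].
dims_t apply_transpose(const dims_t& dims, const dims_t& perm)
{
    dims_t out(perm.size());
    for(std::size_t i = 0; i < perm.size(); ++i)
        out[i] = dims[perm[i]];
    return out;
}

// Fold "transpose(first) followed by transpose(second)" into one permutation.
// After both steps, out[i] = mid[second[i]] = dims[first[second[i]]].
dims_t compose_transposes(const dims_t& first, const dims_t& second)
{
    dims_t result(second.size());
    for(std::size_t i = 0; i < second.size(); ++i)
        result[i] = first[second[i]];
    return result;
}

// A composed permutation equal to 0, 1, 2, ... means both transposes
// cancel and can be removed entirely.
bool is_identity(const dims_t& perm)
{
    for(std::size_t i = 0; i < perm.size(); ++i)
    {
        if(perm[i] != i)
            return false;
    }
    return true;
}
```

For example, composing `{1, 2, 0}` with `{2, 0, 1}` yields the identity permutation, so that pair of transposes can be dropped. Simplifying across reshape, transpose, and broadcast together needs more than permutation composition and is not covered by this sketch.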

This issue has two parts. The first part is to fuse reductions (including split reductions) with MLIR, including any pointwise. The second part is to use multiple outputs when fusing, to...

Perf Improve

Due to the use of `__syncthreads` in the `reduce` methods, registers are not reused. We can reuse them directly by assigning to them with `r.inner([](auto& y, auto x) { y...

Perf Improve

To improve performance for transpose kernels, we should load the transposed inputs into LDS directly and then read from LDS instead. We have functions like `preload_copy` which will do this...

Perf Improve
Tier1