
Integrate reduce tensor, reduce extreme and reduce calculation into MIOpen reduce solver


  • [ ] 1. Reduce extreme (argmin, argmax, min, max, etc.) enhancement for the inner-dim case (https://github.com/ROCm/MIOpen/pull/2766)

  • [ ] 2. Reduce calculation (sum, prod, etc.) enhancement for the inner-dim case (New PR, 2024.03); a host-side sketch of the inner-dim pattern follows this list

  • [ ] 3. Change the existing reduce tensor to a form that uses a solver, then integrate reduce tensor, reduce extreme, and reduce calculation into the MIOpen reduce solver (New PR, 2025)
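For context, the two classes of operation in items 1 and 2 differ mainly in what they produce for the reduced (inner) dimension: an extreme reduction returns the index of the winning element per slice, while a calculation reduction returns only an accumulated value. A minimal host-side C++ sketch of the inner-dim pattern (illustrative only, not MIOpen code):

```cpp
#include <cstddef>
#include <vector>

// Reduce the innermost dimension of a row-major [outer, inner] buffer.

// "Reduce extreme": keep the argmax index for each outer slice.
std::vector<std::size_t> argmax_inner(const std::vector<float>& x,
                                      std::size_t outer, std::size_t inner)
{
    std::vector<std::size_t> idx(outer, 0);
    for(std::size_t o = 0; o < outer; ++o)
        for(std::size_t i = 1; i < inner; ++i)
            if(x[o * inner + i] > x[o * inner + idx[o]])
                idx[o] = i;
    return idx;
}

// "Reduce calculation": accumulate a value (here a sum) for each outer slice.
std::vector<float> sum_inner(const std::vector<float>& x,
                             std::size_t outer, std::size_t inner)
{
    std::vector<float> out(outer, 0.0f);
    for(std::size_t o = 0; o < outer; ++o)
        for(std::size_t i = 0; i < inner; ++i)
            out[o] += x[o * inner + i];
    return out;
}
```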

seungmanhan avatar Feb 29 '24 05:02 seungmanhan

Let's clarify the description of the enhancement. https://github.com/ROCm/MIOpen/pull/2766#issuecomment-1967409613 says:

The ReduceExtreme operations look similar to the existing TensorReduce. What prevents or hinders us from extending the existing TensorReduce with the necessary operations? In other words, why do we need to invent a new API?

So the final goal I am expecting to see here is: converging the ReduceExtreme API with the TensorReduce API into a single public API. The preferable way of doing this is to extend the existing TensorReduce API with the ReduceExtreme operations so that it can accommodate the new functions.
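For concreteness, one hypothetical shape of such a converged API is a single reduction-op enumeration that covers both value-producing and index-producing reductions. The sketch below uses placeholder names only; it is not the actual MIOpen API:

```cpp
// Hypothetical, illustrative sketch of a converged reduce-op enumeration.
// All names here are placeholders, not MIOpen identifiers.
enum class ExampleReduceOp
{
    Add,    // existing "calculation" reductions: produce a value only
    Mul,
    Min,
    Max,
    ArgMin, // proposed "extreme" reductions: produce an index (and optionally a value)
    ArgMax,
};
```

Whether indices are requested through dedicated op values like these or through a separate indices option on the reduce descriptor is exactly the kind of design question this convergence would have to settle.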

Is it feasible, what do you think? Or am I totally wrong?

Reworking the internals of the tensor ops to the Solver/Solution architecture looks to me like a separate goal. A very important one (it adds a lot of flexibility and opens a path for further improvements, especially performance-related ones), but still an intermediate one.

Do you agree? If everything is clear now, can you please update the description and title of this enhancement?

/cc @JehandadKhan

atamazov avatar Feb 29 '24 17:02 atamazov

> _(Quoting @atamazov's comment above in full.)_

If we use the existing TensorReduce as-is, we will not be able to use `IsApplicable` in a solver to assign priorities according to performance. And if the work above is done the same way inside TensorReduce, isn't that the same as creating another Solver/Solution architecture?
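For reference, the kind of gating described here: in a Solver/Solution design, each candidate implementation reports whether it can handle a given problem, and the framework picks among the applicable ones in priority order. A rough sketch under simplified, assumed names (not the actual MIOpen solver interface):

```cpp
#include <cstddef>

// Simplified stand-ins for the problem/solution types (assumptions).
struct ReduceProblem { std::size_t inner_dim_len; bool reduce_inner_dim; };
struct Solution { /* compiled kernel and launch parameters would live here */ };

// Each solver decides for itself whether it applies to a problem,
// which is what enables performance-based prioritization.
struct ReduceSolver
{
    virtual ~ReduceSolver() = default;
    virtual bool IsApplicable(const ReduceProblem& p) const = 0;
    virtual Solution GetSolution(const ReduceProblem& p) const = 0;
};

// Fast path: only claims the problems it is tuned for (e.g. long inner dims).
struct InnerDimFastReduce final : ReduceSolver
{
    bool IsApplicable(const ReduceProblem& p) const override
    {
        return p.reduce_inner_dim && p.inner_dim_len >= 64;
    }
    Solution GetSolution(const ReduceProblem&) const override { return {}; }
};

// Fallback: always applicable, used when no faster solver claims the problem.
struct GenericReduce final : ReduceSolver
{
    bool IsApplicable(const ReduceProblem&) const override { return true; }
    Solution GetSolution(const ReduceProblem&) const override { return {}; }
};
```

Solvers ordered by expected performance are tried in turn and the first applicable one wins; that per-problem selection is the flexibility being argued for here.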

seungmanhan avatar Mar 05 '24 02:03 seungmanhan

Hi @seungmanhan. Is this ticket still relevant? Thanks!

ppanchad-amd avatar Apr 29 '25 17:04 ppanchad-amd

1 and 2 have been completed, but 3 doesn't seem to be in progress yet. I'll check the schedule and get back to you.

seungmanhan avatar Apr 30 '25 07:04 seungmanhan

@ppanchad-amd It looks like the reduce tensor is implemented as a solver now, so it seems okay to skip step 3. Thanks for bringing this forgotten issue to my attention.

seungmanhan avatar Apr 30 '25 07:04 seungmanhan

Thanks @seungmanhan. I will go ahead and close this ticket.

ppanchad-amd avatar Apr 30 '25 14:04 ppanchad-amd